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Systems of Units. Some Important Conversion Factors 


The most important systems of units are shown in the table below. The mks system is also known as 
the International System of Units (abbreviated SI), and the abbreviations sec (instead of s), 
gm (instead of g), and nt (instead of N) are also used. 


System of units Length Mass Time Force 
cgs system centimeter (cm) gram (g) second (s) dyne 
mks system meter (m) kilogram (kg) second (s) newton (nt) 
Engineering system foot (ft) slug second (s) pound (Ib) 
1 inch (in.) = 2.540000 cm 1 foot (ft) = 12 in. = 30.480000 cm 
1 yard (yd) = 3 ft = 91.440000 cm 1 statute mile (mi) = 5280 ft = 1.609344 km 


1 nautical mile = 6080 ft = 1.853184 km 
1 acre = 4840 yd? = 4046.8564 m? 1 mi? = 640 acres = 2.5899881 km? 


1 fluid ounce = 1/128 U.S. gallon = 231/128 in.? = 29.573730 cm? 


1 US. gallon = 4 quarts (liq) = 8 pints (liq) = 128 fl oz = 3785.4118 cm? 

1 British Imperial and Canadian gallon = 1.200949 U.S. gallons = 4546.087 cm? 
1 slug = 14.59390 kg 

1 pound (Ib) = 4.448444 nt 1 newton (nt) = 10° dynes 

1 British thermal unit (Btu) = 1054.35 joules 1 joule = 107 ergs 

1 calorie (cal) = 4.1840 joules 

1 kilowatt-hour (kWh) = 3414.4 Btu = 3.6 - 10° joules 

1 horsepower (hp) = 2542.48 Btu/h = 178.298 cal/sec = 0.74570 kW 


1 kilowatt (kW) = 1000 watts = 3414.43 Btu/h = 238.662 cal/s 


°F = °C - 1.8 + 32 1° = 60’ = 3600” = 0.017453293 radian 


For further details see, for example, D. Halliday, R. Resnick, and J. Walker, Fundamentals of Physics. 9th ed., Hoboken, 
N. J: Wiley, 2011. See also AN American National Standard, ASTM/IEEE Standard Metric Practice, Institute of Electrical and 
Electronics Engineers, Inc. (IEEE), 445 Hoes Lane, Piscataway, N. J. 08854, website at www.ieee.org. 
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Integration 
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Purpose and Structure of the Book 


This book provides a comprehensive, thorough, and up-to-date treatment of engineering 
mathematics. It is intended to introduce students of engineering, physics, mathematics, 
computer science, and related fields to those areas of applied mathematics that are most 
relevant for solving practical problems. A course in elementary calculus is the sole 
prerequisite. (However, a concise refresher of basic calculus for the student is included 
on the inside cover and in Appendix 3.) 


The subject matter is arranged into seven parts as follows: 


. Ordinary Differential Equations (ODEs) in Chapters 1-6 
. Linear Algebra. Vector Calculus. See Chapters 7-10 
. Fourier Analysis. Partial Differential Equations (PDEs). See Chapters 11 and 12 
. Complex Analysis in Chapters 13-18 
Numeric Analysis in Chapters 19-21 
Optimization, Graphs in Chapters 22 and 23 
. Probability, Statistics in Chapters 24 and 25. 


OmMN™"OO WP > 


These are followed by five appendices: 1. References, 2. Answers to Odd-Numbered 
Problems, 3. Auxiliary Materials (see also inside covers of book), 4. Additional Proofs, 
5. Table of Functions. This is shown in a block diagram on the next page. 

The parts of the book are kept independent. In addition, individual chapters are kept as 
independent as possible. (If so needed, any prerequisites—to the level of individual 
sections of prior chapters—are clearly stated at the opening of each chapter.) We give the 
instructor maximum flexibility in selecting the material and tailoring it to his or her 
need. The book has helped to pave the way for the present development of engineering 
mathematics. This new edition will prepare the student for the current tasks and the future 
by a modern approach to the areas listed above. We provide the material and learning 
tools for the students to get a good foundation of engineering mathematics that will help 
them in their careers and in further studies. 


General Features of the Book Include: 


Simplicity of examples to make the book teachable—why choose complicated 
examples when simple ones are as instructive or even better? 


Independence of parts and blocks of chapters to provide flexibility in tailoring 
courses to specific needs. 


Self-contained presentation, except for a few clearly marked places where a proof 
would exceed the level of the book and a reference is given instead. 


Gradual increase in difficulty of material with no jumps or gaps to ensure an 
enjoyable teaching and learning experience. 


Modern standard notation to help students with other courses, modern books, and 
journals in mathematics, engineering, statistics, physics, computer science, and others. 


Furthermore, we designed the book to be a single, self-contained, authoritative, and 
convenient source for studying and teaching applied mathematics, eliminating the need 
for time-consuming searches on the Internet or time-consuming trips to the library to get 
a particular reference book. 
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Four Underlying Themes of the Book 


The driving force in engineering mathematics is the rapid growth of technology and the 
sciences. New areas—often drawing from several disciplines—come into existence. 
Electric cars, solar energy, wind energy, green manufacturing, nanotechnology, risk 
management, biotechnology, biomedical engineering, computer vision, robotics, space 
travel, communication systems, green logistics, transportation systems, financial 
engineering, economics, and many other areas are advancing rapidly. What does this mean 
for engineering mathematics? The engineer has to take a problem from any diverse area 
and be able to model it. This leads to the first of four underlying themes of the book. 


1. Modeling is the process in engineering, physics, computer science, biology, 
chemistry, environmental science, economics, and other fields whereby a physical situation 
or some other observation is translated into a mathematical model. This mathematical 
model could be a system of differential equations, such as in population control (Sec. 4.5), 
a probabilistic model (Chap. 24), such as in risk management, a linear programming 
problem (Secs. 22.2—22.4) in minimizing environmental damage due to pollutants, a 
financial problem of valuing a bond leading to an algebraic equation that has to be solved 
by Newton’s method (Sec. 19.2), and many others. 

The next step is solving the mathematical problem obtained by one of the many 
techniques covered in Advanced Engineering Mathematics. 

The third step is interpreting the mathematical result in physical or other terms to 
see what it means in practice and any implications. 

Finally, we may have to make a decision that may be of an industrial nature or 
recommend a public policy. For example, the population control model may imply 
the policy to stop fishing for 3 years. Or the valuation of the bond may lead to a 
recommendation to buy. The variety is endless, but the underlying mathematics is 
surprisingly powerful and able to provide advice leading to the achievement of goals 
toward the betterment of society, for example, by recommending wise policies 
concerning global warming, better allocation of resources in a manufacturing process, 
or making statistical decisions (such as in Sec. 25.4 whether a drug is effective in treating 
a disease). 

While we cannot predict what the future holds, we do know that the student has to 
practice modeling by being given problems from many different applications as is done 
in this book. We teach modeling from scratch, right in Sec. 1.1, and give many examples 
in Sec. 1.3, and continue to reinforce the modeling process throughout the book. 


2. Judicious use of powerful software for numerics (listed in the beginning of Part E) 
and statistics (Part G) is of growing importance. Projects in engineering and industrial 
companies may involve large problems of modeling very complex systems with hundreds 
of thousands of equations or even more. They require the use of such software. However, 
our policy has always been to leave it up to the instructor to determine the degree of use of 
computers, from none or little use to extensive use. More on this below. 


3. The beauty of engineering mathematics. Engineering mathematics relies on 
relatively few basic concepts and involves powerful unifying principles. We point them 
out whenever they are clearly visible, such as in Sec. 4.1 where we “grow” a mixing 
problem from one tank to two tanks and a circuit problem from one circuit to two circuits, 
thereby also increasing the number of ODEs from one ODE to two ODEs. This is an 
example of an attractive mathematical model because the “growth” in the problem is 
reflected by an “increase” in ODEs. 
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4. To clearly identify the conceptual structure of subject matters. For example, 
complex analysis (in Part D) is a field that is not monolithic in structure but was formed 
by three distinct schools of mathematics. Each gave a different approach, which we clearly 
mark. The first approach is solving complex integrals by Cauchy’s integral formula (Chaps. 
13 and 14), the second approach is to use the Laurent series and solve complex integrals 
by residue integration (Chaps. 15 and 16), and finally we use a geometric approach of 
conformal mapping to solve boundary value problems (Chaps. 17 and 18). Learning the 
conceptual structure and terminology of the different areas of engineering mathematics is 
very important for three reasons: 

a. It allows the student to identify a new problem and put it into the right group of 
problems. The areas of engineering mathematics are growing but most often retain their 
conceptual structure. 

b. The student can absorb new information more rapidly by being able to fit it into the 
conceptual structure. 

c. Knowledge of the conceptual structure and terminology is also important when using 
the Internet to search for mathematical information. Since the search proceeds by putting 
in key words (i.e., terms) into the search engine, the student has to remember the important 
concepts (or be able to look them up in the book) that identify the application and area 
of engineering mathematics. 


Big Changes in This Edition 


@ Problem Sets Changed 

The problem sets have been revised and rebalanced with some problem sets having more 
problems and some less, reflecting changes in engineering mathematics. There is a greater 
emphasis on modeling. Now there are also problems on the discrete Fourier transform 
(in Sec. 11.9). 


2) Series Solutions of ODEs, Special Functions and Fourier Analysis Reorganized 
Chap. 5, on series solutions of ODEs and special functions, has been shortened. Chap. 11 
on Fourier Analysis now contains Sturm—Liouville problems, orthogonal functions, and 
orthogonal eigenfunction expansions (Secs. 11.5, 11.6), where they fit better conceptually 
(rather than in Chap. 5), being extensions of Fourier’s idea of using orthogonal functions. 


3] Openings of Parts and Chapters Rewritten As Well As Parts of Sections 

In order to give the student a better idea of the structure of the material (see Underlying 
Theme 4 above), we have entirely rewritten the openings of parts and chapters. 
Furthermore, large parts or individual paragraphs of sections have been rewritten or new 
sentences inserted into the text. This should give the students a better intuitive 
understanding of the material (see Theme 3 above), let them draw conclusions on their 
own, and be able to tackle more advanced material. Overall, we feel that the book has 
become more detailed and leisurely written. 


4] Student Solutions Manual and Study Guide Enlarged 

Upon the explicit request of the users, the answers provided are more detailed and 
complete. More explanations are given on how to learn the material effectively by pointing 
out what is most important. 


© More Historical Footnotes, Some Enlarged 
Historical footnotes are there to show the student that many people from different countries 
working in different professions, such as surveyors, researchers in industry, etc., contributed 
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to the field of engineering mathematics. It should encourage the students to be creative in 
their own interests and careers and perhaps also to make contributions to engineering 
mathematics. 


Further Changes and New Features 


Parts of Chap. | on first-order ODEs are rewritten. More emphasis on modeling, also 
new block diagram explaining this concept in Sec. 1.1. Early introduction of Euler’s 
method in Sec. 1.2 to familiarize student with basic numerics. More examples of 
separable ODEs in Sec. 1.3. 


For Chap. 2, on second-order ODEs, note the following changes: For ease of reading, 
the first part of Sec. 2.4, which deals with setting up the mass-spring system, has 
been rewritten; also some rewriting in Sec. 2.5 on the Euler—-Cauchy equation. 


Substantially shortened Chap. 5, Series Solutions of ODEs. Special Functions: 
combined Secs. 5.1 and 5.2 into one section called “Power Series Method,” shortened 
material in Sec. 5.4 Bessel’s Equation (of the first kind), removed Sec. 5.7 
(Sturm—Liouville Problems) and Sec. 5.8 (Orthogonal Eigenfunction Expansions) and 
moved material into Chap. 11 (see “Major Changes” above). 


New equivalent definition of basis (Sec. 7.4). 


In Sec. 7.9, completely new part on composition of linear transformations with 
two new examples. Also, more detailed explanation of the role of axioms, in 
connection with the definition of vector space. 


New table of orientation (opening of Chap. 8 “Linear Algebra: Matrix Eigenvalue 
Problems”) where eigenvalue problems occur in the book. More intuitive explanation 
of what an eigenvalue is at the begining of Sec. 8.1. 


Better definition of cross product (in vector differential calculus) by properly 
identifying the degenerate case (in Sec. 9.3). 


Chap. 11 on Fourier Analysis extensively rearranged: Secs. 11.2 and 11.3 
combined into one section (Sec. 11.2), old Sec. 11.4 on complex Fourier Series 
removed and new Secs. 11.5 (Sturm—Liouville Problems) and 11.6 (Orthogonal 
Series) put in (see “Major Changes” above). New problems (new!) in problem set 
11.9 on discrete Fourier transform. 


New section 12.5 on modeling heat flow from a body in space by setting up the heat 
equation. Modeling PDEs is more difficult so we separated the modeling process 
from the solving process (in Sec. 12.6). 


Introduction to Numerics rewritten for greater clarity and better presentation; new 
Example | on how to round a number. Sec. 19.3 on interpolation shortened by 
removing the less important central difference formula and giving a reference instead. 


Large new footnote with historical details in Sec. 22.3, honoring George Dantzig, 
the inventor of the simplex method. 


Traveling salesman problem now described better as a “difficult” problem, typical 
of combinatorial optimization (in Sec. 23.2). More careful explanation on how to 
compute the capacity of a cut set in Sec. 23.6 (Flows on Networks). 


In Chap. 24, material on data representation and characterization restructured in 
terms of five examples and enlarged to include empirical rule on distribution of 
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data, outliers, and the z-score (Sec. 24.1). Furthermore, new example on encription 
(Sec. 24.4). 


e Lists of software for numerics (Part E) and statistics (Part G) updated. 


e References in Appendix 1 updated to include new editions and some references to 
websites. 


Use of Computers 


The presentation in this book is adaptable to various degrees of use of software, 
Computer Algebra Systems (CAS’s), or programmable graphic calculators, ranging 
from no use, very little use, medium use, to intensive use of such technology. The choice 
of how much computer content the course should have is left up to the instructor, thereby 
exhibiting our philosophy of maximum flexibility and adaptability. And, no matter what 
the instructor decides, there will be no gaps or jumps in the text or problem set. Some 
problems are clearly designed as routine and drill exercises and should be solved by 
hand (paper and pencil, or typing on your computer). Other problems require more 
thinking and can also be solved without computers. Then there are problems where the 
computer can give the student a hand. And finally, the book has CAS projects, CAS 
problems and CAS experiments, which do require a computer, and show its power in 
solving problems that are difficult or impossible to access otherwise. Here our goal is 
to combine intelligent computer use with high-quality mathematics. The computer 
invites visualization, experimentation, and independent discovery work. In summary, 
the high degree of flexibility of computer use for the book is possible since there are 
plenty of problems to choose from and the CAS problems can be omitted if desired. 

Note that information on software (what is available and where to order it) is at the 
beginning of Part E on Numeric Analysis and Part G on Probability and Statistics. Since 
Maple and Mathematica are popular Computer Algebra Systems, there are two computer 
guides available that are specifically tailored to Advanced Engineering Mathematics: 
E. Kreyszig and E.J. Norminton, Maple Computer Guide, 10th Edition and Mathematica 
Computer Guide, \Oth Edition. Their use is completely optional as the text in the book is 
written without the guides in mind. 


Suggestions for Courses: A Four-Semester Sequence 


The material, when taken in sequence, is suitable for four consecutive semester courses, 
meeting 3 to 4 hours a week: 


1st Semester ODEs (Chaps. 1-5 or 1-6) 

2nd Semester Linear Algebra. Vector Analysis (Chaps. 7-10) 
3rd Semester Complex Analysis (Chaps. 13-18) 

4th Semester Numeric Methods (Chaps. 19-21) 


Suggestions for Independent One-Semester Courses 


The book is also suitable for various independent one-semester courses meeting 3 hours 
a week. For instance, 


Introduction to ODEs (Chaps. 1-2, 21.1) 
Laplace Transforms (Chap. 6) 
Matrices and Linear Systems (Chaps. 7-8) 
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Vector Algebra and Calculus (Chaps. 9-10) 

Fourier Series and PDEs (Chaps. 11-12, Secs. 21.4—21.7) 
Introduction to Complex Analysis (Chaps. 13-17) 
Numeric Analysis (Chaps. 19, 21) 

Numeric Linear Algebra (Chap. 20) 

Optimization (Chaps. 22—23) 

Graphs and Combinatorial Optimization (Chap. 23) 
Probability and Statistics (Chaps. 24—25) 
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PART A 


Ordinary 
Differential 
Equations (ODEs) 


CHAPTER 1 First-Order ODEs 

CHAPTER 2. Second-Order Linear ODEs 

CHAPTER 3 _ Higher Order Linear ODEs 

CHAPTER 4 _ Systems of ODEs. Phase Plane. Qualitative Methods 
CHAPTER 5. Series Solutions of ODEs. Special Functions 
CHAPTER 6 Laplace Transforms 


Many physical laws and relations can be expressed mathematically in the form of differential 
equations. Thus it is natural that this book opens with the study of differential equations and 
their solutions. Indeed, many engineering problems appear as differential equations. 


The main objectives of Part A are twofold: the study of ordinary differential equations 
and their most important methods for solving them and the study of modeling. 


Ordinary differential equations (ODEs) are differential equations that depend on a single 
variable. The more difficult study of partial differential equations (PDEs), that is, 
differential equations that depend on several variables, is covered in Part C. 


Modeling is a crucial general process in engineering, physics, computer science, biology, 
medicine, environmental science, chemistry, economics, and other fields that translates a 
physical situation or some other observations into a “mathematical model.” Numerous 
examples from engineering (e.g., mixing problem), physics (e.g., Newton’s law of cooling), 
biology (e.g., Gompertz model), chemistry (e.g., radiocarbon dating), environmental science 
(e.g., population control), etc. shall be given, whereby this process is explained in detail, 
that is, how to set up the problems correctly in terms of differential equations. 


For those interested in solving ODEs numerically on the computer, look at Secs. 21.1—21.3 
of Chapter 21 of Part F, that is, numeric methods for ODEs. These sections are kept 
independent by design of the other sections on numerics. This allows for the study of 
numerics for ODEs directly after Chap. 1 or 2. 


CHAPTER | 


First-Order ODEs 


Chapter | begins the study of ordinary differential equations (ODEs) by deriving them from 
physical or other problems (modeling), solving them by standard mathematical methods, 
and interpreting solutions and their graphs in terms of a given problem. The simplest ODEs 
to be discussed are ODEs of the first order because they involve only the first derivative 
of the unknown function and no higher derivatives. These unknown functions will usually 
be denoted by y(x) or y(t) when the independent variable denotes time ¢. The chapter ends 
with a study of the existence and uniqueness of solutions of ODEs in Sec. 1.7. 

Understanding the basics of ODEs requires solving problems by hand (paper and pencil, 
or typing on your computer, but first without the aid of a CAS). In doing so, you will 
gain an important conceptual understanding and feel for the basic terms, such as ODEs, 
direction field, and initial value problem. If you wish, you can use your Computer Algebra 
System (CAS) for checking solutions. 


COMMENT. Numerics for first-order ODEs can be studied immediately after this 
chapter. See Secs. 21.1—21.2, which are independent of other sections on numerics. 


Prerequisite: Integral calculus. 
Sections that may be omitted in a shorter course: 1.6, 1.7. 
References and Answers to Problems: App. | Part A, and App. 2. 
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Fig. 1. Modeling, 
solving, interpreting 
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If we want to solve an engineering problem (usually of a physical nature), we first 
have to formulate the problem as a mathematical expression in terms of variables, 
functions, and equations. Such an expression is known as a mathematical model of the 
given problem. The process of setting up a model, solving it mathematically, and 
interpreting the result in physical or other terms is called mathematical modeling or, 
briefly, modeling. 

Modeling needs experience, which we shall gain by discussing various examples and 
problems. (Your computer may often help you in solving but rarely in setting up models.) 

Now many physical concepts, such as velocity and acceleration, are derivatives. Hence 
a model is very often an equation containing derivatives of an unknown function. Such 
a model is called a differential equation. Of course, we then want to find a solution (a 
function that satisfies the equation), explore its properties, graph it, find values of it, and 
interpret it in physical terms so that we can understand the behavior of the physical system 
in our given problem. However, before we can turn to methods of solution, we must first 
define some basic concepts needed throughout this chapter. 
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Fig. 2. Some applications of differential equations 


An ordinary differential equation (ODE) is an equation that contains one or several 
derivatives of an unknown function, which we usually call y(x) (or sometimes y(f) if the 
independent variable is time ft). The equation may also contain y itself, known functions 
of x (or #), and constants. For example, 


(1) y’ = cos x 
(2) Pye 


(3) yly”" _ By!2 =i 


EXAMPLE 1 


CHAP. 1 First-Order ODEs 


are ordinary differential equations (ODEs). Here, as in calculus, y’ denotes dy/dx, 


= dy/dx?, etc. The term ordinary distinguishes them from partial differential 


equations (PDEs), which involve partial derivatives of an unknown function of two 
or more variables. For instance, a PDE with unknown function u of two variables x 
and y is 


PDEs have important engineering applications, but they are more complicated than ODEs; 
they will be considered in Chap. 12. 

An ODE is said to be of order n if the nth derivative of the unknown function y is the 
highest derivative of y in the equation. The concept of order gives a useful classification 
into ODEs of first order, second order, and so on. Thus, (1) is of first order, (2) of second 
order, and (3) of third order. 

In this chapter we shall consider first-order ODEs. Such equations contain only the 
first derivative y’ and may contain y and any given functions of x. Hence we can write 
them as 


(4) F(x, y,y') = 0 
or often in the form 
y =f@y). 


This is called the explicit form, in contrast to the implicit form (4). For instance, the implicit 
ODE x 3y! = Ay? = 0 (where x # 0) can be written explicitly as y’ = 4x3 y?. 


Concept of Solution 


A function 
y = h(x) 


is called a solution of a given ODE (4) on some open interval a < x < b if A(x) is 
defined and differentiable throughout the interval and is such that the equation becomes 
an identity if y and y’ are replaced with h and h’, respectively. The curve (the graph) of 
h is called a solution curve. 

Here, open interval a < x < b means that the endpoints a and b are not regarded as 
points belonging to the interval. Also,a < x < bincludes infinite intervals -~ < x < b, 
a<x< ©,-% < x < © (the real line) as special cases. 


Verification of Solution 


Verify that y = c/x (c an arbitrary constant) is a solution of the ODE xy’ = —y for all x # 0. Indeed, differentiate 
y =c/x to get y’ = —c/x?. Multiply this by x, obtaining xy’ = —c/x; thus, xy’ = —y, the given ODE. a 
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EXAMPLE 3 


Solution by Calculus. Solution Curves 


The ODE y’ = dy/dx = cos x can be solved directly by integration on both sides. Indeed, using calculus, 
we obtain y = fcos x dx = sinx + c, where c is an arbitrary constant. This is a family of solutions. Each value 
of c, for instance, 2.75 or 0 or —8, gives one of these curves. Figure 3 shows some of them, for c = —3, 


=1,10, 1,.2,3,.4. 


Fig.3. Solutions y = sinx + c of the ODE y’ = cos x 


(A) Exponential Growth. (B) Exponential Decay 


From calculus we know that y = ce®** has the derivative 


1_¥ 


y= a = 0.2097! = 0.2y. 


—2, 


Hence y is a solution of y’ = 0.2y (Fig. 4A). This ODE is of the form y’ = ky. With positive-constant k it can 
model exponential growth, for instance, of colonies of bacteria or populations of animals. It also applies to 
humans for small populations in a large country (e.g., the United States in early times) and is then known as 


Malthus’s law.’ We shall say more about this topic in Sec. 1.5. 


(B) Similarly, y’ = —0.2 (with a minus on the right) has the solution y = ce~°?“, (Fig. 4B) modeling 


exponential decay, as, for instance, of a radioactive substance (see Example 5). 


2.5 
2.0 

1.5 

1.0 

0.5 —— 


Fig. 4A. Solutions of y’ = 0.2y 
in Example 3 (exponential growth) 


Fig. 4B. Solutions of y’ = —0.2y 
in Example 3 (exponential decay) 


1Named after the English pioneer in classic economics, THOMAS ROBERT MALTHUS (1766-1834). 
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We see that each ODE in these examples has a solution that contains an arbitrary 
constant c. Such a solution containing an arbitrary constant c is called a general solution 
of the ODE. 

(We shall see that c is sometimes not completely arbitrary but must be restricted to some 
interval to avoid complex expressions in the solution.) 

We shall develop methods that will give general solutions uniquely (perhaps except for 
notation). Hence we shall say the general solution of a given ODE (instead of a general 
solution). 

Geometrically, the general solution of an ODE is a family of infinitely many solution 
curves, one for each value of the constant c. If we choose a specific c (e.g., c = 6.45 or 0 
or —2.01) we obtain what is called a particular solution of the ODE. A particular solution 
does not contain any arbitrary constants. 

In most cases, general solutions exist, and every solution not containing an arbitrary 
constant is obtained as a particular solution by assigning a suitable value to c. Exceptions 
to these rules occur but are of minor interest in applications; see Prob. 16 in Problem 
Set 1.1. 


Initial Value Problem 


In most cases the unique solution of a given problem, hence a particular solution, is 
obtained from a general solution by an initial condition y(x9) = yo, with given values 
Xg and yo, that is used to determine a value of the arbitrary constant c. Geometrically 
this condition means that the solution curve should pass through the point (x9, yo) 
in the xy-plane. An ODE, together with an initial condition, is called an initial value 
problem. Thus, if the ODE is explicit, y’ = f(x, y), the initial value problem is of 
the form 


(5) y =f, y), y(xo) = yo. 


Initial Value Problem 


Solve the initial value problem 


,_ dy 
y Se 3); y(0) = 5.7. 
dx 


Solution. The general solution is y(x) = ce*”; see Example 3. From this solution and the initial condition 
we obtain y(0) = ce° = c = 5.7. Hence the initial value problem has the solution y(x) = 5.7e?”. This is a 
particular solution. | 


More on Modeling 


The general importance of modeling to the engineer and physicist was emphasized at the 
beginning of this section. We shall now consider a basic physical problem that will show 
the details of the typical steps of modeling. Step 1: the transition from the physical situation 
(the physical system) to its mathematical formulation (its mathematical model); Step 2: 
the solution by a mathematical method; and Step 3: the physical interpretation of the result. 
This may be the easiest way to obtain a first idea of the nature and purpose of differential 
equations and their applications. Realize at the outset that your computer (your CAS) 
may perhaps give you a hand in Step 2, but Steps 1 and 3 are basically your work. 
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EXAMPLE 5 


And Step 2 requires a solid knowledge and good understanding of solution methods 
available to you—you have to choose the method for your work by hand or by the 
computer. Keep this in mind, and always check computer results for errors (which may 
arise, for instance, from false inputs). 


Radioactivity. Exponential Decay 


Given an amount of a radioactive substance, say, 0.5 g (gram), find the amount present at any later time. 
Physical Information. Experiments show that at each instant a radioactive substance decomposes—and is thus 
decaying in time—proportional to the amount of substance present. 


Step 1. Setting up a mathematical model of the physical process. Denote by y(t) the amount of substance still 
present at any time ¢. By the physical law, the time rate of change y’(t) = dy/dt is proportional to y(t). This 
gives the first-order ODE 


dy 


(6) Ge —ky 


where the constant k is positive, so that, because of the minus, we do get decay (as in [B] of Example 3). 
The value of k is known from experiments for various radioactive substances (e.g., k = 1.4 + 1071! sec, 
approximately, for radium 226 Ra). 

Now the given initial amount is 0.5 g, and we can call the corresponding instant t = 0. Then we have the 
initial condition y(O) = 0.5. This is the instant at which our observation of the process begins. It motivates 
the term initial condition (which, however, is also used when the independent variable is not time or when 
we choose a ¢ other than t = 0). Hence the mathematical model of the physical process is the initial value 
problem 


dy 
dt 


(7) —ky, (0) = 0.5. 


Step 2. Mathematical solution. As in (B) of Example 3 we conclude that the ODE (6) models exponential decay 


and has the general solution (with arbitrary constant c but definite given k) 


(8) y(t) = ce, 


We now determine c by using the initial condition. Since y(0) = c from (8), this gives y(0) = c = 0.5. Hence 
the particular solution governing our process is (cf. Fig. 5) 


(9) y(t) = 0.5e7"* (k > 0). 


Always check your result—it may involve human or computer errors! Verify by differentiation (chain rule!) 
that your solution (9) satisfies (7) as well as y(0) = 0.5: 


dy 
dt 


0.5ke~** k+0.5e7"* = —ky, (0) = 0.5e° = 0.5. 


Step 3. Interpretation of result. Formula (9) gives the amount of radioactive substance at time f. It starts from 
the correct initial amount and decreases with time because k is positive. The limit of y as t—> co is zero. & 


(0) ! ! ! 
0 0.5 1 1.5 2 2.5 3 t 


Fig. 5. Radioactivity (Exponential decay, 
y = 05e “, with k = 1.5 as an example) 
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PROBLEM SEF 1-1 
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CALCULUS 


Solve the ODE by integration or by remembering a 
differentiation formula. 


1. y’ + 2 sin 2ax =0 

2. y' + xe" 2 = 0 

3. y’ =y 

4, y' = -1.5y 

5. y’ = 4e-“ cos x 

Gy = -¥ 

7. y’ = cosh 5.13x 

i. =" 

9-15| VERIFICATION. INITIAL VALUE 


PROBLEM (IVP) 


(a) Verify that y is a solution of the ODE. (b) Determine 
from y the particular solution of the IVP. (c) Graph the 
solution of the IVP. 


9, y' +4y=1.4, y =ce*” + 0.35, y(0) =2 
10. y’) + 5xy=0, y=ce 2", yO) =a 

11. y! =yte”, y=(+ce”, y(0) =5 

12. yy =4x, y?-4x%7=cQ > 0), yl) =4 

13. y' =y—y, y= Tee (0) = 0.25 

14, y'tanx =2y—-8, y=csin?x +4, yb) =0 


15. 


16. 


Find two constant solutions of the ODE in Prob. 13 by 
inspection. 

Singular solution. An ODE may sometimes have an 
additional solution that cannot be obtained from the 
general solution and is then called a singular solution. 
The ODE y’2 — xy’ + y = 0 is of this kind. Show 
by differentiation and substitution that it has the 
general solution y = cx — c? and the singular solution 
y= x], Explain Fig. 6. 


Fig. 6. Particular solutions and singular 


solution in Problem 16 


17-20 


MODELING, APPLICATIONS 


These problems will give you a first impression of modeling. 
Many more problems on modeling follow throughout this 
chapter. 


17. 


18. 


19. 


20. 


Half-life. The half-life measures exponential decay. 
It is the time in which half of the given amount of 
radioactive substance will disappear. What is the half- 
life of #28Ra (in years) in Example 5? 


Half-life. Radium 733Ra has a_ half-life of about 
3.6 days. 

(a) Given | gram, how much will still be present after 
1 day? 

(b) After 1 year? 


Free fall. In dropping a stone or an iron ball, air 
resistance is practically negligible. Experiments 
show that the acceleration of the motion is constant 
(equal to g = 9.80 m/sec” = 32 ft/sec”,called the 
acceleration of gravity). Model this as an ODE for 
y(t), the distance fallen as a function of time ¢. If the 
motion starts at time t = 0 from rest (i.e., with velocity 
v = y’ = 0), show that you obtain the familiar law of 
free fall 


y = 280”. 


Exponential decay. Subsonic flight. The efficiency 
of the engines of subsonic airplanes depends on air 
pressure and is usually maximum near 35,000 ft. 
Find the air pressure y(x) at this height. Physical 
information. The rate of change y'(x) is proportional 
to the pressure. At 18,000 ft it is half its value 
Yo = y(O) at sea level. Hint. Remember from calculus 
that if y = e**, then y’ = ke“ = ky. Can you see 
without calculation that the answer should be close 
to Yo/' 4? 
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|.2 Geometric Meaning of y’ = f(x, y). 
Direction Fields, Euler’s Method 


A first-order ODE 
(1) y’ = f(x,y) 


has a simple geometric interpretation. From calculus you know that the derivative y’(x) of 
y(x) is the slope of y(x). Hence a solution curve of (1) that passes through a point (x9, yo) 
must have, at that point, the slope y’(x) equal to the value of f at that point; that is, 


y (xo) = f(Xo. Yo). 


Using this fact, we can develop graphic or numeric methods for obtaining approximate 
solutions of ODEs (1). This will lead to a better conceptual understanding of an ODE (1). 
Moreover, such methods are of practical importance since many ODEs have complicated 
solution formulas or no solution formulas at all, whereby numeric methods are needed. 


Graphic Method of Direction Fields. Practical Example Illustrated in Fig. 7. We 
can show directions of solution curves of a given ODE (1) by drawing short straight-line 
segments (lineal elements) in the xy-plane. This gives a direction field (or slope field) 
into which you can then fit (approximate) solution curves. This may reveal typical 
properties of the whole family of solutions. 

Figure 7 shows a direction field for the ODE 


(2) y =y+z 


obtained by a CAS (Computer Algebra System) and some approximate solution curves 
fitted in. 


y 
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Fig. 7. Direction field of y’ = y + x, with three approximate solution 
curves passing through (0, 1), (0, 0), (0, —1), respectively 


CHAP. 1 First-Order ODEs 


If you have no CAS, first draw a few level curves f(x, y) = const of f(x, y), then parallel 
lineal elements along each such curve (which is also called an isocline, meaning a curve 
of equal inclination), and finally draw approximation curves fit to the lineal elements. 

We shall now illustrate how numeric methods work by applying the simplest numeric 
method, that is Euler’s method, to an initial value problem involving ODE (2). First we 
give a brief description of Euler’s method. 


Numeric Method by Euler 


Given an ODE (1) and an initial value y(xo) = yo, Euler’s method yields approximate 
solution values at equidistant x-values x9, x1 = X9 + h,xg = X09 + 2h,---, namely, 


y1 = yo + hf(xo, Yo) (Fig. 8) 


yo =y1 + hf(x1,yp), ete. 
In general, 
Yn = Yn-1 + hf(Xn-1, Yn—V 


where the step / equals, e.g., 0.1 or 0.2 (as in Table 1.1) or a smaller value for greater 
accuracy. 


Solution curve 


y(x,) 


r Error of y, 


yy 


> hflx 9, Vo) 


Yo 


1 x 


Fig. 8. First Euler step, showing a solution curve, its tangent at (Xo, Yo), 
step h and increment hf(Xo, Yo) in the formula for y, 


Table 1.1 shows the computation of n = 5 steps with step h = 0.2 for the ODE (2) and 
initial condition y(0) = 0, corresponding to the middle curve in the direction field. We 
shall solve the ODE exactly in Sec. 1.5. For the time being, verify that the initial value 
problem has the solution y = e” — x — 1. The solution curve and the values in Table 1.1 
are shown in Fig. 9. These values are rather inaccurate. The errors y(x,) — yn are shown 
in Table 1.1 as well as in Fig. 9. Decreasing h would improve the values, but would soon 
require an impractical amount of computation. Much better methods of a similar nature 
will be discussed in Sec. 21.1. 
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Table 1.1. Euler method for y’ = y + x, y(0) = 0 for 
x = 0, ---,1.0 with step h = 0.2 


n Xn Yn yxy) Error 
0 0.0 0.000 0.000 0.000 
1 0.2 0.000 0.021 0.021 
2 0.4 0.04 0.092 0.052 
3 0.6 0.128 0.222 0.094 
4 0.8 0.274 0.426 0.152 
5 1.0 0.488 0.718 0.230 
y 

0.7 

O.5F e 

O.3- 

O.1F ° 

Ld 


‘es 
0 0.2 
Fig. 9. 


DIRECTION FIELDS, SOLUTION CURVES 


Graph a direction field (by a CAS or by hand). In the field 
graph several solution curves by hand, particularly those 
passing through the given points (x, y). 


Ly =1+y%, Ga) 

2. yy’ +4x=0, (1, 1), (0, 2) 

3. y' =1—y7, (0,0), 2,5) 

4. y' =2y—y", (0,0), (0, 1), (0, 2), (0, 3) 
5. y'=x-l1/y, (1.9) 

6. y' =sin?y, (0, —0.4), (0, 1) 

7. y' =e", (2, 2), GB, 3) 

8. y’ = —2xy, (0,5), (0, 1), (, 2) 


9-10 | ACCURACY OF DIRECTION FIELDS 


Direction fields are very useful because they can give you 
an impression of all solutions without solving the ODE, 
which may be difficult or even impossible. To get a feel for 
the accuracy of the method, graph a field, sketch solution 
curves in it, and compare them with the exact solutions. 


9. y'’ = cos mx 

10. y' = —S5y¥/? (Sol. Vy + 3x =) 

11. Autonomous ODE. This means an ODE not showing 
x (the independent variable) explicitly. (The ODEs in 
Probs. 6 and 10 are autonomous.) What will the level 
curves f(x, y) = const (also called isoclines = curves 


! ! 
0.4 0.6 0.8 1 * 


Euler method: Approximate values in Table 1.1 and solution curve 


PROBLEM SET T.2 


of equal inclination) of an autonomous ODE look like? 
Give reason. 


MOTIONS 


Model the motion of a body B on a straight line with 
velocity as given, y(t) being the distance of B from a point 
y = 0 at time ¢. Graph a direction field of the model (the 
ODE). In the field sketch the solution curve satisfying the 
given initial condition. 


12. Product of velocity times distance constant, equal to 2, 
yO) = 2. 


13. Distance = Velocity X Time, y1)=1 


14. Square of the distance plus square of the velocity equal 
to I, initial distance 1/V2 


15. Parachutist. Two forces act on a_ parachutist, the 
attraction by the earth mg (m = mass of person plus 
equipment, g = 9.8 m/ sec” the acceleration of gravity) 
and the air resistance, assumed to be proportional to the 
square of the velocity u(f). Using Newton’s second law 
of motion (mass X acceleration = resultant of the forces), 
set up a model (an ODE for v(f)). Graph a direction field 
(choosing m and the constant of proportionality equal to 1). 
Assume that the parachute opens when v = 10 m/sec. 
Graph the corresponding solution in the field. What is the 
limiting velocity? Would the parachute still be sufficient 
if the air resistance were only proportional to u(t)? 
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CHAP. 1 First-Order ODEs 


16. CAS PROJECT. Direction Fields. Discuss direction 


fields as follows. 
(a) Graph portions of the direction field of the ODE (2) 


(see Fig. 7), for instance, -5 SxS2,-1SyS5. 


Explain what you have gained by this enlargement of 
the portion of the field. 

(b) Using implicit differentiation, find an ODE with 
the general solution x2 + Oy” =c(y > 0). Graph its 
direction field. Does the field give the impression 
that the solution curves may be semi-ellipses? Can you 
do similar work for circles? Hyperbolas? Parabolas? 
Other curves? 

(c) Make a conjecture about the solutions of y’ = —x/y 
from the direction field. 

(d) Graph the direction field of y’ = —4y and some 
solutions of your choice. How do they behave? Why 
do they decrease for y > 0? 


17-20 | EULER’S METHOD 


This is the simplest method to explain numerically solving 
an ODE, more precisely, an initial value problem (IVP). 
(More accurate methods based on the same principle are 
explained in Sec. 21.1.) Using the method, to get a feel for 
numerics as well as for the nature of IVPs, solve the IVP 
numerically with a PC or a calculator, 10 steps. Graph the 
computed values and the solution curve on the same 
coordinate axes. 


17. y'=y, yO)=1, A=0.1 

18. y'=y, y0)=1, A=0.01 

19. y =(y— x, yO) =0, h=0.1 
Sol. y = x — tanh x 

20. y’ = —5x4y?, (0) 
Sol. y = 1/(1 + x») 


ll 
= 
> 

ll 
= 
tv 


1.3 Separable ODEs. Modeling 


Many practically useful ODEs can be reduced to the form 


(1) gv) y’ = f@) 

by purely algebraic manipulations. Then we can integrate on both sides with respect to x, 
obtaining 

(2) | g(y) y'dx = | (Oe & 


On the left we can switch to y as the variable of integration. By calculus, y’dx = dy, so that 


(3) | 09 dy = [700 dx +c. 


EXAMPLE 1 


If f and g are continuous functions, the integrals in (3) exist, and by evaluating them we 
obtain a general solution of (1). This method of solving ODEs is called the method of 
separating variables, and (1) is called a separable equation, because in (3) the variables 
are now separated: x appears only on the right and y only on the left. 


Separable ODE 


The ODE y’ = 1 + y is separable because it can be written 


dy 


= dx. By integration, arctany =x+c or y =tan(x + c). 


It is very important to introduce the constant of integration immediately when the integration is performed. 
If we wrote arctan y = x, then y = tan x, and then introduced c, we would have obtained y = tan x + c, which 
is not a solution (when c # 0). Verify this. | 
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EXAMPLE 2 


EXAMPLE 3 


EXAMPLE 4 


Separable ODE 


The ODE y’ = (x + L)e~*y? is separable; we obtain y~? dy = (x + le~* dx. 


By integration, yt (x + 2)e™* + ¢, y Ge — =e Py 
Initial Value Problem (IVP). Bell-Shaped Curve 
Solve y’ = —2xy, y(0) = 1.8. 
Solution. By separation and integration, 
dy eed a2 
— = —2x dx, Iny = -x° +c, y=ce 


y 


This is the general solution. From it and the initial condition, y(0) = ce° = c = 1.8. Hence the IVP has the 


solution y = 1.8e~*’. This is a particular solution, representing a bell-shaped curve (Fig. 10). I] 
BA 
1T- 
L 
zy; =i 0 1 De 


Fig. 10. Solution in Example 3 (bell-shaped curve) 


Modeling 


The importance of modeling was emphasized in Sec. 1.1, and separable equations yield 
various useful models. Let us discuss this in terms of some typical examples. 


Radiocarbon Dating” 


In September 1991 the famous Iceman (Oetzi), a mummy from the Neolithic period of the Stone Age found in 
the ice of the Oetztal Alps (hence the name “‘Oetzi’”’) in Southern Tyrolia near the Austrian-Italian border, caused 
a scientific sensation. When did Oetzi approximately live and die if the ratio of carbon 1§C to carbon 12C in 
this mummy is 52.5% of that of a living organism? 

Physical Information. In the atmosphere and in living organisms, the ratio of radioactive carbon 18C (made 
radioactive by cosmic rays) to ordinary carbon 12C is constant. When an organism dies, its absorption of 4$C 
by breathing and eating terminates. Hence one can estimate the age of a fossil by comparing the radioactive 
carbon ratio in the fossil with that in the atmosphere. To do this, one needs to know the half-life of AG; which 
is 5715 years (CRC Handbook of Chemistry and Physics, 83rd ed., Boca Raton: CRC Press, 2002, page 11-52, 
line 9). 


Solution. Modeling. Radioactive decay is governed by the ODE y’ = ky (see Sec. 1.1, Example 5). By 
separation and integration (where f is time and yo is the initial ratio of uC to e)) 
dy 


—=kdt, In |y| = kt +c, y = yoeX* (vo = e°). 


y 


2Method by WILLARD FRANK LIBBY (1908-1980), American chemist, who was awarded for this work 
the 1960 Nobel Prize in chemistry. 


14 


EXAMPLE 5 


CHAP. 1 First-Order ODEs 


Next we use the half-life H = 5715 to determine k. When t = H, half of the original substance is still present. Thus, 


In 0.5 0.693 

kH kH 

Y = 0. = 0. k .0001 213. 
yoe Syo, e 5, H 5715 0.00 3 


Finally, we use the ratio 52.5% for determining the time t when Oetzi died (actually, was killed), 


In 0.525 
Wehaceeg OORT AIS 0505, t= eee 5312. Answer: About 5300 years ago. 


. ~ =0,0001213 


Other methods show that radiocarbon dating values are usually too small. According to recent research, this is 
due to a variation in that carbon ratio because of industrial pollution and other factors, such as nuclear testing. Ml 


Mixing Problem 


Mixing problems occur quite frequently in chemical industry. We explain here how to solve the basic model 
involving a single tank. The tank in Fig. 11 contains 1000 gal of water in which initially 100 lb of salt is dissolved. 
Brine runs in at a rate of 10 gal/min, and each gallon contains 5 lb of dissoved salt. The mixture in the tank is 
kept uniform by stirring. Brine runs out at 10 gal/min. Find the amount of salt in the tank at any time ¢. 


Solution. Step 1. Setting up a model. Let y(t) denote the amount of salt in the tank at time f. Its time rate 
of change is 
y’ = Salt inflow rate — Salt outflow rate Balance law. 


5 Ib times 10 gal gives an inflow of 50 Ib of salt. Now, the outflow is 10 gal of brine. This is 10/1000 = 0.01 
(= 1%) of the total brine content in the tank, hence 0.01 of the salt content y(t), that is, 0.01 y(t). Thus the 
model is the ODE 


(4) y’ = 50 —- 0.0ly 0.01(y — 5000). 


Step 2. Solution of the model. The ODE (4) is separable. Separation, integration, and taking exponents on both 
sides gives 


dy 
——_——~ = —0.01 dt 1 — 5000] = —0.01r + c* ~ 5000 = ce70-01t. 
y — 5000 ? n ly — 5000] = —0 ce, y — 5000 = ce 


Initially the tank contains 100 lb of salt. Hence y(0) = 100 is the initial condition that will give the unique 
solution. Substituting y = 100 and ¢ = 0 in the last equation gives 100 — 5000 = ce = c. Hence c = —4900. 
Hence the amount of salt in the tank at time f is 


(5) y(t) = 5000 — 4900e~°-1", 


This function shows an exponential approach to the limit 5000 lb; see Fig. 11. Can you explain physically that 
y(t) should increase with time? That its limit is 5000 Ib? Can you see the limit directly from the ODE? 

The model discussed becomes more realistic in problems on pollutants in lakes (see Problem Set 1.5, Prob. 35) 
or drugs in organs. These types of problems are more difficult because the mixing may be imperfect and the flow 
rates (in and out) may be different and known only very roughly. ia 


y 
5000 
4000 


—_ — 3000 


=— | 2000 


4 = 1000 
i 
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(e) 100 200 300 400 500 ¢ 


Tank Salt content y(¢) 


Fig. 11. Mixing problem in Example 5 
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EXAMPLE 6 


Heating an Office Building (Newton’s Law of Cooling’) 


Suppose that in winter the daytime temperature in a certain office building is maintained at 70°F. The heating 
is shut off at 10 P M. and turned on again at 6 A M. On a certain day the temperature inside the building at 2 A M. 
was found to be 65°F. The outside temperature was 50°F at 10 PM. and had dropped to 40°F by 6 AM. What 
was the temperature inside the building when the heat was turned on at 6 A M.? 

Physical information. Experiments show that the time rate of change of the temperature T of a body B (which 
conducts heat well, for example, as a copper ball does) is proportional to the difference between T and the 
temperature of the surrounding medium (Newton’s law of cooling). 


Solution. Step 1. Setting up a model. Let T(t) be the temperature inside the building and 7, the outside 
temperature (assumed to be constant in Newton’s law). Then by Newton’s law, 


dT 
(6) an K(T — Ta). 


Such experimental laws are derived under idealized assumptions that rarely hold exactly. However, even if a 
model seems to fit the reality only poorly (as in the present case), it may still give valuable qualitative information. 
To see how good a model is, the engineer will collect experimental data and compare them with calculations 
from the model. 


Step 2. General solution. We cannot solve (6) because we do not know Tq, just that it varied between 50°F 
and 40°F, so we follow the Golden Rule: If you cannot solve your problem, try to solve a simpler one. We 
solve (6) with the unknown function T, replaced with the average of the two known values, or 45°F. For physical 
reasons we may expect that this will give us a reasonable approximate value of 7 in the building at 6 A M. 

For constant JZ, = 45 (or any other constant value) the ODE (6) is separable. Separation, integration, and 
taking exponents gives the general solution 


dT 


Fag 2 In |T — 45| = kt + c*, T(t) = 45 + cekt (c=). 


Step 3. Particular solution. We choose 10 P M. to be t = 0. Then the given initial condition is 7(0) = 70 and 
yields a particular solution, call it 7,. By substitution, 


T(0) = 45 + ce® = 70, c= 70 — 45 = 25, T(t) = 45 + 25e", 


Step 4. Determination of k. We use T(4) = 65, where t = 4 is 2 AM. Solving algebraically for & and inserting 
k into 7,(t) gives (Fig. 12) 


T,(4) = 45 + 25e%* = 65, e** = 0.8, k = 41n0.8 = —0.056, Tp(t) = 45 + 25¢~ 0096", 


Fig. 12. Particular solution (temperature) in Example 6 


3sir ISAAC NEWTON (1642-1727), great English physicist and mathematician, became a professor at 
Cambridge in 1669 and Master of the Mint in 1699. He and the German mathematician and philosopher 
GOTTFRIED WILHELM LEIBNIZ (1646-1716) invented (independently) the differential and integral calculus. 
Newton discovered many basic physical laws and created the method of investigating physical problems by 
means of calculus. His Philosophiae naturalis principia mathematica (Mathematical Principles of Natural 
Philosophy, 1687) contains the development of classical mechanics. His work is of greatest importance to both 
mathematics and physics. 
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EXAMPLE 7 


CHAP. 1 First-Order ODEs 


Step 5. Answer and interpretation. 6 AM. is t = 8 (namely, 8 hours after 10 P M.), and 
Tp(8) = 45 + 25e~ 9-988 = 61/ °F]. 


Hence the temperature in the building dropped 9°F, a result that looks reasonable. | 


Leaking Tank. Outflow of Water Through a Hole (Torricelli’s Law) 


This is another prototype engineering problem that leads to an ODE. It concerns the outflow of water from a 
cylindrical tank with a hole at the bottom (Fig. 13). You are asked to find the height of the water in the tank at 
any time if the tank has diameter 2 m, the hole has diameter | cm, and the initial height of the water when the 
hole is opened is 2.25 m. When will the tank be empty? 

Physical information. Under the influence of gravity the outflowing water has velocity 


(7) v(t) = 0.600 V 2gh(t) (Torricelli’s law’), 


where h(t) is the height of the water above the hole at time ¢, and g = 980 cm/ sec” = 32.17 ft/ sec” is the 
acceleration of gravity at the surface of the earth. 


Solution. Step 1. Setting up the model. To get an equation, we relate the decrease in water level h(t) to the 
outflow. The volume AV of the outflow during a short time A? is 


AV = Av At (A = Area of hole). 
AV must equal the change AV* of the volume of the water in the tank. Now 
AV* = —B Ah (B = Cross-sectional area of tank) 


where Ah (> 0) is the decrease of the height h(t) of the water. The minus sign appears because the volume of 
the water in the tank decreases. Equating AV and AV* gives 


—B Ah = Av At. 


We now express v according to Torricelli’s law and then let Ar (the length of the time interval considered) 
approach 0—this is a standard way of obtaining an ODE as a model. That is, we have 


Ah A A 
At OB v= — 0.600V 2gh(t) 


and by letting At — 0 we obtain the ODE 


dh A 

— = —26.56 — Vh, 

dt B oe 
where 26.56 = 0.600V 2 + 980. This is our model, a first-order ODE. 


Step 2. General solution. Our ODE is separable. A/B is constant. Separation and integration gives 


Ee apg d 2Vh = *— 265644 

Vi B B an c si B 5 

Dividing by 2 and squaring gives h = (c — 13.28At/B)*. Inserting 13.28A/B = 13.28 + 0.5277/1007ar = 0.000332 
yields the general solution 


h(t) = (c — 0.000332r)?. 


4EVANGELISTA TORRICELLI (1608-1647), Italian physicist, pupil and successor of GALILEO GALILEI 
(1564-1642) at Florence. The “contraction factor’ 0.600 was introduced by J. C. BORDA in 1766 because the 
stream has a smaller cross section than the area of the hole. 
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Step 3. Particular solution. The initial height (the initial condition) is h(0) = 225 cm. Substitution of t = 0 
and h = 225 gives from the general solution c? = 225, c = 15.00 and thus the particular solution (Fig. 13) 


hy(t) = (15.00 — 0.0003322)”. 


Step 4. Tank empty. h,(t) = 0 if t = 15.00/0.000332 = #5181 |see| = 12.6 [hours]. 
Here you see distinctly the importance of the choice of units—-we have been working with the cgs system, 
in which time is measured in seconds! We used g = 980 cm/sec”. 


Step 5. Checking. Check the result. s] 


Water level 
at time ¢ 


0 | | 
water 6) 10000 30000 50000 ¢ 


Tank Water level A(¢) in tank 


Fig. 13. Example 7. Outflow from a cylindrical tank (“leaking tank’). 
Torricelli’s law 


Extended Method: Reduction to Separable Form 


Certain nonseparable ODEs can be made separable by transformations that introduce for 
y a new unknown function. We discuss this technique for a class of ODEs of practical 
importance, namely, for equations 


(8) y= (2) 


Here, f is any (differentiable) function of y/x, such as sin(y/x), (y/x), and so on. (Such 
an ODE is sometimes called a homogeneous ODE, a term we shall not use but reserve 
for a more important purpose in Sec. 1.5.) 

The form of such an ODE suggests that we set y/x = u; thus, 


(9) y = ux and by product differentiation y’ =ul'x tu. 


Substitution into y’ = f(y/x) then gives u’x + u = f(u) or u'x = f(u) — u. We see that 
if f(u) — u # O, this can be separated: 


du dx 
(10) er ee 


18 


EXAMPLE 8 


CHAP. 1 First-Order ODEs 


Reduction to Separable Form 


Solve 
Qxyy’ = y? — x?. 


Solution. To get the usual explicit form, divide the given equation by 2xy, 


2xy Ox Dy 
Now substitute y and y’ from (9) and then simplify by subtracting wu on both sides, 


25 ‘ u 1 -uw-1 
Qu 2 2u Qu 


f ae u 
uxtu=— 
2 


You see that in the last equation you can now separate the variables, 


=-—, By integration, In(1 4 u2) In |x| + c* = In 


Take exponents on both sides to get 1 + u? = c/x or 1 + (y/x)? = c/x. Multiply the last equation by x” to 
obtain (Fig. 14) 


2 2 
2 2 c 2 c 
4 = i Th = = + =, 
x y CX. US (: ) y 


This general solution represents a family of circles passing through the origin with centers on the x-axis. Mi 


Fig. 14. General solution (family of circles) in Example 8 


PROBLEM SET 1-3 


1. CAUTION! Constant of integration. Why is it 11-17} INITIAL VALUE PROBLEMS (IVPs) 


important to introduce the constant of integration 
immediately when you integrate? 


Solve the IVP. Show the steps of derivation, beginning with 
the general solution. 


2-10| GENERAL SOLUTION 11. xy +y=0, y(4)=6 


Find a general solution. Show the steps of derivation. Check 12. y’ =|]+ Ay’, y(1) = 0 


your answer by substitution. 


2. y3y’ + x3 =0 


13. y'cosh?x = sin? y, (0) = 477 


qa = sec? y 14. dr/dt = —2tr, r(O) = ro 

4. y' sin 27x = ary cos 27x 15. y’ = —4x/y,  y(2) =3 

5. yy’ + 36x = 0 16. y =(@+y—27, y(0) =2 
6. y' =e hy? (Setv =x + y — 2) 


7. xy’ =y + 2x3 sin ‘ (Set y/x = u) 


8. y' = (y + 4x)? 
xy =y?+y 


a 


2 17. xy’ = y + 3x*cos?(y/x), yl) =0 
(Set y/x = u) 
(Set y + 4x = v) 18. Particular solution. Introduce limits of integration in 
(Set y/x = u) (3) such that y obtained from (3) satisfies the initial 


10. xy’ =x+y (Set y/x = u) condition y(x9) = yo. 
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19-36 | MODELING, APPLICATIONS 


19. 


20. 


21. 


22. 


23. 


24. 


25. 


26. 


27. 


Exponential growth. Ifthe growth rate of the number 
of bacteria at any time ¢ is proportional to the number 
present at ¢ and doubles in 1 week, how many bacteria 
can be expected after 2 weeks? After 4 weeks? 


Another population model. 


(a) If the birth rate and death rate of the number of 
bacteria are proportional to the number of bacteria 
present, what is the population as a function of time. 


(b) What is the limiting situation for increasing time? 
Interpret it. 


Radiocarbon dating. What should be the '¢C content 
(in percent of yg) of a fossilized tree that is claimed to 
be 3000 years old? (See Example 4.) 


Linear accelerators are used in physics for 
accelerating charged particles. Suppose that an alpha 
particle enters an accelerator and undergoes a constant 
acceleration that increases the speed of the particle 
from 10? m/sec to 10* m/sec in 1073 sec. Find the 
acceleration a and the distance traveled during that 
period of 1073 sec. 

Boyle—Mariotte’s law for ideal gases.*> Experiments 
show for a gas at low pressure p (and constant 
temperature) the rate of change of the volume V(p) 
equals —V/p. Solve the model. 


Mixing problem. A tank contains 400 gal of brine 
in which 100 lb of salt are dissolved. Fresh water runs 
into the tank at a rate of 2 gal/min.The mixture, kept 
practically uniform by stirring, runs out at the same 
rate. How much salt will there be in the tank at the 
end of 1 hour? 


Newton’s law of cooling. A thermometer, reading 
5°C, is brought into a room whose temperature is 22°C. 
One minute later the thermometer reading is 12°C. 
How long does it take until the reading is practically 
22°C, say, 21.9°C? 

Gompertz growth in tumors. The Gompertz model 
is y’ = —Aylny(A > 0), where y(t) is the mass of 
tumor cells at time ¢. The model agrees well with 
clinical observations. The declining growth rate with 
increasing y > 1 corresponds to the fact that cells in 
the interior of a tumor may die because of insufficient 
oxygen and nutrients. Use the ODE to discuss the 
growth and decline of solutions (tumors) and to find 
constant solutions. Then solve the ODE. 

Dryer. If a wet sheet in a dryer loses its moisture at 
a rate proportional to its moisture content, and if it 
loses half of its moisture during the first 10 min of 


28. 


29. 


30. 


31. 


32. 
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drying, when will it be practically dry, say, when will 
it have lost 99% of its moisture? First guess, then 
calculate. 


Estimation. Could you see, practically without calcu- 
lation, that the answer in Prob. 27 must lie between 
60 and 70 min? Explain. 


Alibi? Jack, arrested when leaving a bar, claims that 
he has been inside for at least half an hour (which 
would provide him with an alibi). The police check 
the water temperature of his car (parked near the 
entrance of the bar) at the instant of arrest and again 
30 min later, obtaining the values 190°F and 110°F, 
respectively. Do these results give Jack an alibi? 
(Solve by inspection.) 


Rocket. A rocket is shot straight up from the earth, 
with a net acceleration (= acceleration by the rocket 
engine minus gravitational pullback) of 7tfm/ sec” 
during the initial stage of flight until the engine cut out 
at t = 10 sec. How high will it go, air resistance 
neglected? 


Solution curves of y’ = g(y/x). Show that any 
(nonvertical) straight line through the origin of the 
xy-plane intersects all these curves of a given ODE at 
the same angle. 


Friction. If a body slides on a surface, it experiences 
friction F (a force against the direction of motion). 
Experiments show that |F| = y2|N| (Coulomb’s® law of 
kinetic friction without lubrication), where N is the 
normal force (force that holds the two surfaces together; 
see Fig. 15) and the constant of proportionality jw is 
called the coefficient of kinetic friction. In Fig. 15 
assume that the body weighs 45 nt (about 10 lb; see 
front cover for conversion). = 0.20 (corresponding 
to steel on steel), a = 30°, the slide is 10 m long, the 
initial velocity is zero, and air resistance is 
negligible. Find the velocity of the body at the end 
of the slide. 


Problem 32 


Fig. 15. 


®ROBERT BOYLE (1627-1691), English physicist and chemist, one of the founders of the Royal Society. EDME MARIOTTE (about 
1620-1684), French physicist and prior of a monastry near Dijon. They found the law experimentally in 1662 and 1676, respectively. 


®CHARLES AUGUSTIN DE COULOMB (1736-1806), French physicist and engineer. 
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33. 


34. 


CHAP. 1 First-Order ODEs 


Rope. To tie a boat in a harbor, how many times 
must a rope be wound around a bollard (a vertical 
rough cylindrical post fixed on the ground) so that a 
man holding one end of the rope can resist a force 
exerted by the boat 1000 times greater than the man 
can exert? First guess. Experiments show that the 
change AS of the force S in a small portion of the 
rope is proportional to S and to the small angle A¢ 
in Fig. 16. Take the proportionality constant 0.15. 
The result should surprise you! 


: Small 
portion 
of rope 
S+AS 

Fig. 16. Problem 33 


TEAM PROJECT. Family of Curves. A family of 
curves can often be characterized as the general 
solution of y’ = f(x, y). 

(a) Show that for the circles with center at the origin 
we get y’ = —x/y. 

(b) Graph some of the hyperbolas xy = c. Find an 
ODE for them. 

(c) Find an ODE for the straight lines through the 
origin. 

(d) You will see that the product of the right sides of 
the ODEs in (a) and (c) equals —1. Do you recognize 


35. 


36. 


this as the condition for the two families to be 
orthogonal (i.e., to intersect at right angles)? Do your 
graphs confirm this? 


(e) Sketch families of curves of your own choice and 
find their ODEs. Can every family of curves be given 
by an ODE? 


CAS PROJECT. Graphing Solutions. A CAS can 
usually graph solutions, even if they are integrals that 
cannot be evaluated by the usual analytical methods of 
calculus. 

(a) Show this for the five initial value problems 
y= e™, y(0) = 0, +1, £2 graphing all five curves 
on the same axes. 

(b) Graph approximate solution curves, using the first 
few terms of the Maclaurin series (obtained by term- 
wise integration of that of y’) and compare with the 
exact curves. 

(c) Repeat the work in (a) for another ODE and initial 
conditions of your own choice, leading to an integral 
that cannot be evaluated as indicated. 


TEAM PROJECT. Torricelli’s Law. Suppose that 
the tank in Example 7 is hemispherical, of radius R, 
initially full of water, and has an outlet of 5 cm? cross- 
sectional area at the bottom. (Make a sketch.) Set 
up the model for outflow. Indicate what portion of 
your work in Example 7 you can use (so that it can 
become part of the general method independent of the 
shape of the tank). Find the time f to empty the tank 
(a) for any R, (b) for R = 1 m. Plot ¢ as function of 
R. Find the time when h = R/2 (a) for any R, (b) for 
R=1m. 


1.4 Exact ODEs. Integrating Factors 


We recall from calculus that if a function u(x, y) has continuous partial derivatives, its 
differential (also called its total differential) is 


du 


Ou 
—dx + 
Ox 


My 
— dy. 
oy ~ 


From this it follows that if u(x, y) = c = const, then du = 0. 
For example, if u = x + xy? = c, then 


du = (1 + 2xy3) dx + 3x7? dy =0 


or 


1 2xy? 


’ 


3x22 


SEC. 1.4 Exact ODEs. Integrating Factors 21 


an ODE that we can solve by going backward. This idea leads to a powerful solution 
method as follows. 
A first-order ODE M(x, y) + N(x, yyy’ = 0, written as (use dy = y'dx as in Sec. 1.3) 


(1) M(x, y) dx + N(x, y) dy = 0 


is called an exact differential equation if the differential form M(x, y) dx + N(, y) dy 
is exact, that is, this form is the differential 


ou ou 


Uu 
2 du = dx + —d 
- . Ox . ie 


of some function u(x, y). Then (1) can be written 
du = 0. 


By integration we immediately obtain the general solution of (1) in the form 
(3) u(x, y) = ¢. 


This is called an implicit solution, in contrast to a solution y = h(x) as defined in Sec. 
1.1, which is also called an explicit solution, for distinction. Sometimes an implicit solution 
can be converted to explicit form. (Do this for x2 + y? = |.) If this is not possible, your 
CAS may graph a figure of the contour lines (3) of the function u(x, y) and help you in 
understanding the solution. 

Comparing (1) and (2), we see that (1) is an exact differential equation if there is some 
function u(x, y) such that 


4 oF i), =N 
(4) @ao& O&O TaN 


From this we can derive a formula for checking whether (1) is exact or not, as follows. 

Let M and N be continuous and have continuous first partial derivatives in a region in 
the xy-plane whose boundary is a closed curve without self-intersections. Then by partial 
differentiation of (4) (see App. 3.2 for notation), 


aM au 
oy dy ax’ 
aN au 
dx dxdy 


By the assumption of continuity the two second partial derivaties are equal. Thus 


aM _ aN 


(5) dy ax’ 
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EXAMPLE 1 


CHAP. 1 First-Order ODEs 


This condition is not only necessary but also sufficient for (1) to be an exact differential 
equation. (We shall prove this in Sec. 10.2 in another context. Some calculus books, for 
instance, [GenRef 12], also contain a proof.) 

If (1) is exact, the function u(x, y) can be found by inspection or in the following 
systematic way. From (4a) we have by integration with respect to x 


(6) y= [mac + Ko» 


in this integration, y is to be regarded as a constant, and k(y) plays the role of a “constant” 
of integration. To determine k(y), we derive du/dy from (6), use (4b) to get dk/dy, and 
integrate dk/dy to get k. (See Example 1, below.) 

Formula (6) was obtained from (4a). Instead of (4a) we may equally well use (4b). 
Then, instead of (6), we first have by integration with respect to y 


(6*) n= | N dy + I(x). 


To determine /(x), we derive du/dx from (6*), use (4a) to get di/dx, and integrate. We 
illustrate all this by the following typical examples. 


An Exact ODE 


Solve 


(7) cos (x + y) dx + (Gy? + 2y + cos (x + y)) dy = 0. 
Solution. Step 1. Test for exactness. Our equation is of the form (1) with 


M = cos (x + y), 


N = 3y? + 2y + cos (x + y). 


Thus 
oM in ( ) 
= —sin(x + y), 
oy ss 
oN . 
— = —sin(x + y). 
Ox . 


From this and (5) we see that (7) is exact. 


Step 2. Implicit general solution. From (6) we obtain by integration 


(8) u fu dx + k(y) [cos (x + y) dx + k(y) = sin( + y) + kG). 
To find k(y), we differentiate this formula with respect to y and use formula (4b), obtaining 


a 1k 
“= cos (x+y)4 4 N = 3y” + 2y + cos(x + y). 
ay dy 


Hence dk/dy = 3y2 + 2y. By integration, k = y? os y? + c*. Inserting this result into (8) and observing (3), 
we obtain the answer 


u(x, y) = sin(x + y) 4 y> ty e 
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EXAMPLE 2 


EXAMPLE 3 


Step 3. Checking an implicit solution. We can check by differentiating the implicit solution u(x, y) = c 
implicitly and see whether this leads to the given ODE (7): 


ou ou 2 
(9) du F dx + 5 dy = cos (x + y) dx + (cos (x + y) + 3y~ + 2y) dy = 0. 
x y 
This completes the check. | 


An Initial Value Problem 

Solve the initial value problem 

(10) (cos y sinh x + 1) dx — sin y cosh x dy = 0, yl) = 2. 

Solution. You may verify that the given ODE is exact. We find uw. For a change, let us use (6*), 
u= -|siny cosh x dy + I(x) = cos ycoshx + I(x). 


From this, du/dx = cos y sinh x + dl/dx = M = cosy sinhx + 1.Hencedl/dx = 1.By integration, (x) = x + c*. 
This gives the general solution u(x, y) = cos y cosh x + x = c. From the initial condition, cos 2 cosh 1 + 1 = 
0.358 = c. Hence the answer is cos ycoshx + x = 0.358. Figure 17 shows the particular solutions for c = 0, 0.358 
(thicker curve), 1, 2, 3. Check that the answer satisfies the ODE. (Proceed as in Example 1.) Also check that the 
initial condition is satisfied. iad] 


l l 
0 Oo 120 T,5 2:0 2.5: 3.0. x 
Fig. 17. Particular solutions in Example 2 


WARNING! Breakdown in the Case of Nonexactness 


The equation —y dx + x dy = 0 is not exact because M = —y and N = x, so that in (5), M/dy = —1 but 
dN/dx = 1. Let us show that in such a case the present method does not work. From (6), 


7) dk 
u [u dx + k(y) xy + ky), hence ow = = 
oy dy 


Now, du/dy should equal N = x, by (4b). However, this is impossible because k(y) can depend only on y. Try 
(6*); it will also fail. Solve the equation by another method that we have discussed. | 


Reduction to Exact Form. Integrating Factors 


The ODE in Example 3 is —y dx + x dy = 0. It is not exact. However, if we multiply it 
by 1 Frogs we get an exact equation [check exactness by (5)!], 


~ydx + xd I 
a) ae et a =a(2)=0 
X xX 


Xx XxX 


Integration of (11) then gives the general solution y/x = c = const. 
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EXAMPLE 4 


CHAP. 1 First-Order ODEs 
This example gives the idea. All we did was to multiply a given nonexact equation, say, 


(12) P(x, y) dx + Q(x, y) dy = 0, 


by a function F that, in general, will be a function of both x and y. The result was an equation 
(13) FP dx + FQdy =0 


that is exact, so we can solve it as just discussed. Such a function F(x, y) is then called 
an integrating factor of (12). 


Integrating Factor 


The integrating factor in (11) is F = 1/x?. Hence in this case the exact equation (13) is 


—ydx + xdy y y 
FP dx + FQ dy 5 d 0. Solution SSC. 
x x x 


These are straight lines y = cx through the origin. (Note that x = 0 is also a solution of —y dx + x dy = 0.) 
It is remarkable that we can readily find other integrating factors for the equation —y dx + x dy = 0, namely, 
1/y?, 1/(xy), and 1/(x? + y?), because 


—ydx + xdy =—yde Pr xa -ydx +xd 
(14) : 2 d (2) z : d (« *), - 2 d (wan *). | 
x 


y* y xy y x+y? 


How to Find Integrating Factors 


In simpler cases we may find integrating factors by inspection or perhaps after some trials, 
keeping (14) in mind. In the general case, the idea is the following. 

For M dx + N dy = 0 the exactness condition (5) is 0M/dy = dN/dx. Hence for (13), 
FP dx + FQ dy = 0, the exactness condition is 


15 © @eye 
(15) ay FP) = 5 FO). 


By the product rule, with subscripts denoting partial derivatives, this gives 
FyP + FP, = FQ + FQz. 
In the general case, this would be complicated and useless. So we follow the Golden Rule: 
If you cannot solve your problem, try to solve a simpler one—the result may be useful 
(and may also help you later on). Hence we look for an integrating factor depending only 
on one variable: fortunately, in many practical cases, there are such factors, as we shall 
see. Thus, let F = F(x). Then F, = 0, and F;, = Fo = dF/dx, so that (15) becomes 
FP, = F'Q + FQz. 


Dividing by FQ and reshuffling terms, we have 


(16) ——_=R, where R=— 
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This proves the following theorem. 


THEOREM 1 Integrating Factor F (x) 


Tf (12) is such that the right side R of (16) depends only on x, then (12) has an 
integrating factor F = F(x), which is obtained by integrating (16) and taking 
exponents on both sides. 


(17) F(x) = exp | R(x) dx. 


Similarly, if F* = F*(y), then instead of (16) we get 


(18) en, where R*¥ =— 


and we have the companion 


THEOREM 2 Integrating Factor F*(y) 


If (12) is such that the right side R* of (18) depends only on y, then (12) has an 
integrating factor F* = F*(y), which is obtained from (18) in the form 


(19) F*(y) = exp | Rr) dy. 


EXAMPLE 5_ Application of Theorems 1 and 2. Initial Value Problem 


Using Theorem | or 2, find an integrating factor and solve the initial value problem 


(20) (e" *4 + yeY) dx + (xeY — 1)dy = 0, y(0) 1 
Solution. Step 1. Nonexactness. The exactness check fails: 


oP a (eet + ye) = et FY y y 9 8 y 
oy oy Ox Ox 


Step 2. Integrating factor. General solution. Theorem | fails because R [the right side of (16)] depends on 
both x and y. 


R ts 2) : (e774 + eY + ye¥ — e%), 


Try Theorem 2. The right side of (18) is 


1/9Q oP 1 : 
x y x+y y y 
R +(= =) OU 4 vel (e e e ye") 1. 


Hence (19) gives the integrating factor F*(y) = e ¥. From this result and (20) you get the exact equation 


(e* + y) dx + (x — ee) dy = 0. 
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CHAP. 1 First-Order ODEs 


Test for exactness; you will get 1 on both sides of the exactness condition. By integration, using (4a), 


u [Gi + y)dx =e + xy + kX). 
Differentiate this with respect to y and use (4b) to get 
a dk dk 
Wax N end —=-e 4, k=e4%+c%, 
dy dy dy 


Hence the general solution is 


u(x,y) =e” +xy +e 


Yoe, 


Setp 3. Particular solution. The initial condition y(0) = —1 gives u(0, —1) = 1 + 0 + e = 3.72. Hence the 


+ xy + 


eY=1+e 


answer is e” 


3.72. Figure 18 shows several particular solutions obtained as level curves 


of u(x, y) = c, obtained by a CAS, a convenient way in cases in which it is impossible or difficult to cast a 
solution into explicit form. Note the curve that (nearly) satisfies the initial condition. 


Step 4. Checking. Check by substitution that the answer satisfies the given equation as well as the initial 


condition. 


Fig. 18. 


PROBLEM SET 1.4 


1-14 


ODEs. INTEGRATING FACTORS 


Test for exactness. If exact, solve. If not, use an integrating 
factor as given or obtained by inspection or by the theorems 
in the text. Also, if an initial condition is given, find the 
corresponding particular solution. 


1. 


NIA & WN 


2xy dx + x dy =0 


» x8dx + yedy =0 

. sinxcos ydx + cosxsin ydy = 0 
. e(dr + 3rd0) = 0 

. (x2 + y) dx — 2xy dy = 0 

. 3(y + 1) dx = 2xdy, (y+ 1x74 


i 2x tan y dx + sec? ydy = 0 


Particular solutions in Example 5 


8. e*(cos y dx — sin ydy) = 0 


9. e*(2 cos y dx — sin y dy) = 0, 
10. 
11. 
12. 
13. 
14, 


15. 


y(0) = 0 
ydx + [y + tan(x + y)]}dy =0, cos(x + y) 
2 cosh x cos y dx = sinh x sin y dy 
(2xy dx + dye” =0, y0)=2 
e Ydx +e “(-e 4 + 1) dy = 0, 
(a + l)y dx + (b+ 1)xdy = 0, 
Foxy 


F=e"t¥ 


yd) = 1, 


Exactness. Under what conditions for the constants a, 
b,k, lis (ax + by) dx + (kx + ly) dy = 0 exact? Solve 
the exact ODE. 


SEC. 1.5 Linear ODEs. 


16. 


17. 


1. 


TEAM PROJECT. Solution by Several Methods. 
Show this as indicated. Compare the amount of work. 


(a) e4(sinh x dx + cosh x dy) = Oas an exact ODE 
and by separation. 


(b) (1 + 2x) cos ydx + dy/cos y = Oby Theorem 2 
and by separation. 


(c) (x? + y?) dx — 2xy dy = Oby Theorem | or 2 and 
by separation with v = y/x. 


(d) 3x7 y dx + 4x3 dy = 0 by Theorems | and 2 and 
by separation. 


(e) Search the text and the problems for further ODEs 
that can be solved by more than one of the methods 
discussed so far. Make a list of these ODEs. Find 
further cases of your own. 


WRITING PROJECT. Working Backward. 
Working backward from the solution to the problem 
is useful in many areas. Euler, Lagrange, and other 
great masters did it. To get additional insight into 
the idea of integrating factors, start from a u(x, y) of 
your choice, find du = 0, destroy exactness by 
division by some F(x, y), and see what ODE’s 
solvable by integrating factors you can get. Can you 
proceed systematically, beginning with the simplest 
F(x, y)? 


5 Linear ODEs. 
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18. CAS PROJECT. Graphing Particular Solutions. 


Graph particular solutions of the following ODE, 
proceeding as explained. 

(21) dy — y* sin x dx = 0. 

(a) Show that (21) is not exact. Find an integrating 
factor using either Theorem 1| or 2. Solve (21). 

(b) Solve (21) by separating variables. Is this simpler 
than (a)? 

(c) Graph the seven particular solutions satisfying the 
following initial conditions y(0) = 1, y(a7/2) = +5, 
+3, +1 (see figure below). 

(d) Which solution of (21) do we not get in (a) or (b)? 


Particular solutions in CAS Project 18 


Bernoulli Equation. 
Population Dynamics 


Linear ODEs or ODEs that can be transformed to linear form are models of various 
phenomena, for instance, in physics, biology, population dynamics, and ecology, as we 
shall see. A first-order ODE is said to be linear if it can be brought into the form 


(1) 


y’ + p@y = r@), 


by algebra, and nonlinear if it cannot be brought into this form. 

The defining feature of the linear ODE (1) is that it is linear in both the unknown 
function y and its derivative y’ = dy/dx, whereas p and r may be any given functions of 
x. If in an application the independent variable is time, we write ¢ instead of x. 

If the first term is f(x)y’ (instead of y’), divide the equation by f(x) to get the standard 
form (1), with y’ as the first term, which is practical. 

For instance, y’ cosx + ysinx =x is a linear ODE, and its standard form is 


y’ + ytanx = x sec x. 


The function r(x) on the right may be a force, and the solution y(x) a displacement in 


a motion or an electrical current or some other physical quantity. In engineering, r(x) is 
frequently called the input, and y(x) is called the output or the response to the input (and, 
if given, to the initial condition). 
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CHAP. 1 First-Order ODEs 


Homogeneous Linear ODE. We want to solve (1) in some interval a < x < b, call 
it J, and we begin with the simpler special case that r(x) is zero for all x in J. (This is 
sometimes written r(x) = 0.) Then the ODE (1) becomes 


(2) y’ + p@y =0 


and is called homogeneous. By separating variables and integrating we then obtain 
dy 

— = —p(x)dx, thus In ly| = — | p(x)dx + c*. 

y 


Taking exponents on both sides, we obtain the general solution of the homogeneous 
ODE (2), 


(3) Vo)\=ce (c = +e when y=0); 


here we may also choose c = 0 and obtain the trivial solution y(x) = 0 for all x in that 
interval. 


Nonhomogeneous Linear ODE. We now solve (1) in the case that r(x) in (1) is not 
everywhere zero in the interval J considered. Then the ODE (1) is called nonhomogeneous. 
It turns out that in this case, (1) has a pleasant property; namely, it has an integrating factor 
depending only on x. We can find this factor F(x) by Theorem | in the previous section 
or we can proceed directly, as follows. We multiply (1) by F(x), obtaining 
(1*) Fy’ + pFy = rF. 
The left side is the derivative (Fy)’ = F’y + Fy’ of the product Fy if 
pFy = F'y, thus pF =F’. 

By separating variables, dF/F = p dx. By integration, writing h = fp dx, 

In|Fl =h= [pax thus Foe" 


With this F and h’ = p, Eq. (1*) becomes 
ery’ + hey = ely! + (ey = (e"yy’ = re™ 
By integration, 
e"y = [em dx +c. 


Dividing by e”, we obtain the desired solution formula 


(4) y(x) = e*( | er dx + °) h= | p(x) dx. 
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EXAMPLE 1 


EXAMPLE 2 


This reduces solving (1) to the generally simpler task of evaluating integrals. For ODEs 
for which this is still difficult, you may have to use a numeric method for integrals from 
Sec. 19.5 or for the ODE itself from Sec. 21.1. We mention that / has nothing to do with 
h(x) in Sec. 1.1 and that the constant of integration in h does not matter; see Prob. 2. 

The structure of (4) is interesting. The only quantity depending on a given initial 
condition is c. Accordingly, writing (4) as a sum of two terms, 


4") v0 =e] ord + ce 
we see the following: 


(5) Total Output = Response to the Input r + Response to the Initial Data. 


First-Order ODE, General Solution, Initial Value Problem 


Solve the initial value problem 


y’ + y tan x = sin 2x, y(0) = 1. 
Solution. Here p = tanx,r = sin 2x = 2 sinx cos x, and 


h= [ra = [tan vax = In |sec x]. 
From this we see that in (4), 


h —h 1 F i 
e = sec x, ee =cosx, e'r = (sec x)(2 sin x cos x) = 2 sin x, 


and the general solution of our equation is 
y(x) = cos x(2 [sin xdx + c) = ccosx — 2 cos2x. 


From this and the initial condition, 1 =c-1—2- Ir; thus c = 3 and the solution of our initial value problem 
is y = 3cosx — 2 cos” x. Here 3 cos x is the response to the initial data, and —2 cos” x is the response to the 
input sin 2x. B 


Electric Circuit 


Model the RL-circuit in Fig. 19 and solve the resulting ODE for the current /(t) A (amperes), where ¢ is 
time. Assume that the circuit contains as an EMF E(f) (electromotive force) a battery of E = 48 V (volts), which 
is constant, a resistor of R = 11 © (ohms), and an inductor of L = 0.1 H (henrys), and that the current is initially 
Zero. 


Physical Laws. A current / in the circuit causes a voltage drop RI across the resistor (Ohm’s law) and 
a voltage drop LI’ = L dI/dt across the conductor, and the sum of these two voltage drops equals the EMF 
(Kirchhoff’s Voltage Law, KVL). 


Remark. In general, KVL states that “The voltage (the electromotive force EMF) impressed on a closed 
loop is equal to the sum of the voltage drops across all the other elements of the loop.” For Kirchoff’s Current 
Law (KCL) and historical information, see footnote 7 in Sec. 2.9. 


Solution. According to these laws the model of the RL-circuit is LI’ + RI = E(t), in standard form 


R 
(6) l+—-f=—, 
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CHAP. 1 First-Order ODEs 


We can solve this linear ODE by (4) with x = t, y = I, p = R/L, h = (R/L)t, obtaining the general solution 
f= ene | tn EO a m c) 
L F 


By integration, 


R/L 
(7) l= eon E aa c) = + ce P/M 
L R/L R 


In our case, R/L = 11/0.1 = 110 and E(t) = 48/0.1 = 480 = const; thus, 


7 110t 


In modeling, one often gets better insight into the nature of a solution (and smaller roundoff errors) by inserting 
given numeric data only near the end. Here, the general solution (7) shows that the current approaches the limit 
E/R = 48/11 faster the larger R/L is, in our case, R/L = 11/0.1 = 110, and the approach is very fast, from 
below if 1(0) < 48/11 or from above if (0) > 48/11. If (0) = 48/11, the solution is constant (48/11 A). See 
Fig. 19. 

The initial value 1(0) = 0 gives (0) = E/R +c =0,¢ E/R and the particular solution 


E 48 
8 T= =(1 — e7 B/D, ‘his T= (1 — e711, 
(8) R' ) n° PI 
I(t) 
8 
R=110 
AY, 6 
e aL 
H=48V 
@ 
2 
OQ ) fl 
£20.10 0.01 0.02 0.03 0.04 0.05 t 
Circuit Current I(¢) 


Fig. 19. RL-circuit 


Hormone Level 


Assume that the level of a certain hormone in the blood of a patient varies with time. Suppose that the time rate 
of change is the difference between a sinusoidal input of a 24-hour period from the thyroid gland and a continuous 
removal rate proportional to the level present. Set up a model for the hormone level in the blood and find its 
general solution. Find the particular solution satisfying a suitable initial condition. 


Solution. Step 1. Setting up a model. Let y(t) be the hormone level at time t. Then the removal rate is Ky(*). 
The input rate is A + B cos wt, where w = 277/24 = 77/12 and A is the average input rate; here A 2 B to make 
the input rate nonnegative. The constants A, B, K can be determined from measurements. Hence the model is the 
linear ODE 


y(t) = In — Out = A + Bcos wt — Ky(t), thus y’ + Ky =A + Boos at. 


The initial condition for a particular solution Ypar iS Ypan(O) = yo with t = 0 suitably chosen, for example, 
6:00 AM. 


Step 2. General solution. In (4) we have p = K = const, h = Kt, and r = A + Bcos wt. Hence (4) gives the 
general solution (evaluate sex cos wt dt by integration by parts) 
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y(t) = ett | ett (4 + Boos wr) dt + ce ** 


A B 
= erkigk| 4 + — 2 (Keos ot + w sin wr) + ce~Kt 
K K*+4+@2 
A B ( Tt oT =) ey. 
n K cos t sin reer. 
K K24 (1/12)? 12 12 12 


The last term decreases to 0 as f increases, practically after a short time and regardless of c (that is, of the initial 
condition). The other part of y(t) is called the steady-state solution because it consists of constant and periodic 
terms. The entire solution is called the transient-state solution because it models the transition from rest to the 
steady state. These terms are used quite generally for physical and other systems whose behavior depends on time. 


Step 3. Particular solution. Setting t = 0 in y(t) and choosing yp = 0, we have 


A B u A KB 
y(0) er 4 K+c=0, thus c 3 5: 
KK? 4+ (7/12)? 7 KK? + (@/12) 


Inserting this result into y(t), we obtain the particular solution 


w=44 B (« wt 7 =) (4 KB ) =x 
i + cos sin + e 
rea KK? + (ar /12? 12 12° 12 K K2 + (@r/12)? 


with the steady-state part as before. To plot ypax we must specify values for the constants, say, A = B = 1 
and K = 0.05. Figure 20 shows this solution. Notice that the transition period is relatively short (although 
K is small), and the curve soon looks sinusoidal; this is the response to the input A + B cos (qb 71) = 
1 + cos & Tt). 


() 1 l ! 
O 100 


1 
200 t 


Fig. 20. Particular solution in Example 3 


Reduction to Linear Form. Bernoulli Equation 


Numerous applications can be modeled by ODEs that are nonlinear but can be transformed 
to linear ODEs. One of the most useful ones of these is the Bernoulli equation’ 


(9) y+ p@y = g@)y® (a any real number). 


7JAKOB BERNOULLI (1654-1705), Swiss mathematician, professor at Basel, also known for his contribution 
to elasticity theory and mathematical probability. The method for solving Bernoulli’s equation was discovered by 
Leibniz in 1696. Jakob Bernoulli’s students included his nephew NIKLAUS BERNOULLI (1687-1759), who 
contributed to probability theory and infinite series, and his youngest brother JOHANN BERNOULLI (1667-1748), 
who had profound influence on the development of calculus, became Jakob’s successor at Basel, and had among 
his students GABRIEL CRAMER (see Sec. 7.7) and LEONHARD EULER (see Sec. 2.5). His son DANIEL 
BERNOULLI (1700-1782) is known for his basic work in fluid flow and the kinetic theory of gases. 
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EXAMPLE 4 


CHAP. 1 First-Order ODEs 


If a = 0 or a = 1, Equation (9) is linear. Otherwise it is nonlinear. Then we set 


u(x) = eve Nimans 


We differentiate this and substitute y’ from (9), obtaining 


u’ = (1 — ajy~°y’ = (1 — ay “gy” — py). 


Simplification gives 
uw! = (1 — a(g — py’, 


a 


where y!~* = u on the right, so that we get the linear ODE 


(10) u’ + (1 — a)pu = (1 — ag. 


For further ODEs reducible to linear form, see Ince’s classic [A11] listed in App. 1. See 


also Team Project 30 in Problem Set 1.5. 


Logistic Equation 


Solve the following Bernoulli equation, known as the logistic equation (or Verhulst equation’): 


a) y’ = Ay — By” 


Solution. Write (11) in the form (9), that is, 


y’ — Ay = —By” 


to see that a = 2, so that u = ye = yh. Differentiate this u and substitute y’ from (11), 
ul = ~y~y! = ~y "(Ay — By?) = B- Ay 1. 
The last term is Ay! = —Au. Hence we have obtained the linear ODE 
ul + Au = B. 


The general solution is [by (4)] 
u = ce At + B/A. 
Since u = 1/y, this gives the general solution of (11), 


1 1 
(12) SS 
eu ceAt + B/A 


Directly from (11) we see that y = 0 (y(t) = 0 for all 4) is also a solution. 


(Fig. 21) 


8PIERRE-FRANCOIS VERHULST, Belgian statistician, who introduced Eq. (8) as a model for human 


population growth in 1838. 
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EXAMPLE 5 


Population y 
6 
As 
vie 4 
2 
| l l | 
) 1 2 3 4 Time t 


Fig. 21. Logistic population model. Curves (9) in Example 4 with A/B = 4 


Population Dynamics 


The logistic equation (11) plays an important role in population dynamics, a field 
that models the evolution of populations of plants, animals, or humans over time f. 
If B = 0, then (11) is y’ = dy/dt = Ay. In this case its solution (12) is y = (1/c)e“* 
and gives exponential growth, as for a small population in a large country (the 
United States in early times!). This is called Malthus’s law. (See also Example 3 in 
Sec. 1.1.) 

The term —By? in (11) is a “braking term” that prevents the population from growing 
without bound. Indeed, if we write y’ = Ay[1 — (B/A)y], we see that if y < A/B, then 
y’ > 0, so that an initially small population keeps growing as long as y < A/B. But if 
y > A/B, then y’ < 0 and the population is decreasing as long as y > A/B. The limit 
is the same in both cases, namely, A/B. See Fig. 21. 

We see that in the logistic equation (11) the independent variable t does not occur 
explicitly. An ODE y’ = f(t, y) in which f does not occur explicitly is of the form 


(13) y =f0) 


and is called an autonomous ODE. Thus the logistic equation (11) is autonomous. 

Equation (13) has constant solutions, called equilibrium solutions or equilibrium 
points. These are determined by the zeros of f(y), because f(y) = 0 gives y’ = 0 by 
(13); hence y = const. These zeros are known as critical points of (13). An 
equilibrium solution is called stable if solutions close to it for some ¢ remain close 
to it for all further ¢. It is called unstable if solutions initially close to it do not remain 
close to it as ¢ increases. For instance, y = O in Fig. 21 is an unstable equilibrium 
solution, and y = 4 is a stable one. Note that (11) has the critical points y = 0 and 
y=A/B. 


Stable and Unstable Equilibrium Solutions. “Phase Line Plot” 


The ODE y’ = (y — 1)(y — 2)has the stable equilibrium solution y; = 1 and the unstable y, = 2, as the direction 
field in Fig. 22 suggests. The values y, and y, are the zeros of the parabola f(y) = (vy — 1)(y — 2) in the figure. 
Now, since the ODE is autonomous, we can “condense” the direction field to a “phase line plot” giving y, and 
y2, and the direction (upward or downward) of the arrows in the field, and thus giving information about the 
stability or instability of the equilibrium solutions. | 
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Fig. 22. Example 5. (A) Direction field. (B) “Phase line”. (C) Parabola f(y) 


A few further population models will be discussed in the problem set. For some more 
details of population dynamics, see C. W. Clark. Mathematical Bioeconomics: The 
Mathematics of Conservation 3rd ed. Hoboken, NJ, Wiley, 2010. 

Further applications of linear ODEs follow in the next section. 


1. CAUTION! Show that e7™* = 1/x (not —x) and 


e7 nsec x) cos x. 


2. Integration constant. Give a reason why in (4) you may 
choose the constant of integration in fp dx to be zero. 


GENERAL SOLUTION. INITIAL VALUE 
PROBLEMS 
Find the general solution. If an initial condition is given, 


find also the corresponding particular solution and graph or 
sketch it. (Show the details of your work.) 


3. y' —y=5.2 

4, y' = 2y — 4x 

5.y +k =e 

6. y' + 2y=4cos2x, y(47r) = 3 

7. xy’ = 2y + x36 

8. y’ + ytanx = e 9 cos x, y(0) = 0 
9. y' + ysinx = e%*, (0) = —2.5 


10. y’ cosx + Gy — 1)secx = 0, 
11. y’ = (y — 2)cotx 

12. xy’ + 4y = 8x*, yl) =2 
13. y’ = 6(y — 2.5)tanh 1.5x 


yQqm) = 4/3 


PROBLEEM—SET 1-5 


14. CAS EXPERIMENT. (a) Solve the ODEy’ — y/x = 
—x~teos (1 /x).Find an initial condition for which the 
arbitrary constant becomes zero. Graph the resulting 
particular solution, experimenting to obtain a good 


figure near x = 0. 


(b) Generalizing (a) from n = | to arbitrary n, solve the 
ODE y’ — ny/x = =x" cos (1/x). Find an initial 
condition as in (a) and experiment with the graph. 


15-20; GENERAL PROPERTIES OF LINEAR ODEs 


These properties are of practical and theoretical importance 
because they enable us to obtain new solutions from given 
ones. Thus in modeling, whenever possible, we prefer linear 
ODEs over nonlinear ones, which have no similar properties. 

Show that nonhomogeneous linear ODEs (1) and homo- 
geneous linear ODEs (2) have the following properties. 
Illustrate each property by a calculation for two or three 
equations of your choice. Give proofs. 


15. The sum y; + yo of two solutions y; and yo of the 
homogeneous equation (2) is a solution of (2), and so is 
a scalar multiple ay; for any constant a. These properties 
are not true for (1)! 
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16. 


17. 


18. 
19. 
20. 


21. 


y = 0 (that is, y(x) = 0 for all x, also written y(x) = 0) 
is a solution of (2) [not of (1) if r(x) # O!], called the 
trivial solution. 

The sum of a solution of (1) and a solution of (2) is a 
solution of (1). 

The difference of two solutions of (1) is a solution of (2). 
If yy is a solution of (1), what can you say about cy,? 
If y; and yg are solutions of y; + py; = 7, and 
yg + py2 = re, respectively (with the same p!), what 
can you say about the sum y; + yg? 

Variation of parameter. Another method of obtaining 
(4) results from the following idea. Write (3) as cy*, 
where y* is the exponential function, which is a solution 


of the homogeneous linear ODE y*’ + py* = 0. 


Replace the arbitrary constant c in (3) with a function 
u to be determined so that the resulting function y = uy* 
is a solution of the nonhomogeneous linear ODE 
y’ + py=r. 
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NONLINEAR ODEs 


Using a method of this section or separating variables, find 
the general solution. If an initial condition is given, find 
also the particular solution and sketch or graph it. 


22. 
23. 
24. 
25. 
26. 
27. 
28. 
29. 


30. 


yo ty=y?, (0) = -3 
y txy=xy 4, yO) =3 
y y= may 

y' =3.2y- 10y 


y’ = (tan y)/(x — 1), yO) = 27 

y’ = 1/(6e" — 2x) 

Qxyy’ + (x — Ly? = xe" (Set y? = z) 

REPORT PROJECT. Transformation of ODEs. 
We have transformed ODEs to separable form, to exact 
form, and to linear form. The purpose of such 
transformations is an extension of solution methods to 
larger classes of ODEs. Describe the key idea of each 
of these transformations and give three typical exam- 
ples of your choice for each transformation. Show each 
step (not just the transformed ODE). 

TEAM PROJECT. Riccati Equation. Clairaut 
Equation. Singular Solution. 

A Riccati equation is of the form 


(14) 


2 


y’ + pQdy = g(xy” + AQ). 


A Clairaut equation is of the form 
(15) y = xy" + 36’). 


(a) Apply the transformation y = Y + 1/u to the 
Riccati equation (14), where Y is a solution of (14), and 
obtain for u the linear ODE u'’ + (2¥g — p)u = —g. 
Explain the effect of the transformation by writing it 
asy = Yt+o,v = 1/u. 
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(b) Show that y= Y= x is a solution of the ODE 
y’ — (2x3 +1) y = —x7y? — x4 — x + Land solve this 
Riccati equation, showing the details. 
2 


(c) Solve the Clairaut equation y’* — xy’ + y = 0 as 
follows. Differentiate it with respect to x, obtaining 
y"(2y' — x) = 0. Then solve (A) y” =0 and (B) 
2y’ — x = Oseparately and substitute the two solutions 
(a) and (b) of (A) and (B) into the given ODE. Thus 
obtain (a) a general solution (straight lines) and (b) a 
parabola for which those lines (a) are tangents (Fig. 6 
in Prob. Set 1.1); so (b) is the envelope of (a). Such a 
solution (b) that cannot be obtained from a general 
solution is called a singular solution. 

(d) Show that the Clairaut equation (15) has as 
solutions a family of straight lines y = cx + g(c) and 
a singular solution determined by g’(s) = —x, where 
s = y’, that forms the envelope of that family. 
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MODELING. FURTHER APPLICATIONS 


31. 


32. 


33. 


34. 


35. 


Newton’s law of cooling. If the temperature of a cake 
is 300°F when it leaves the oven and is 200°F ten 
minutes later, when will it be practically equal to the 
room temperature of 60°F, say, when will it be 61°F? 
Heating and cooling of a building. Heating and 
cooling of a building can be modeled by the ODE 


T =k 


T,) + kT 


where T = 7(f) is the temperature in the building at 
time ¢, J, the outside temperature, 7, the temperature 
wanted in the building, and P the rate of increase of T 
due to machines and people in the building, and k, and 
kg are (negative) constants. Solve this ODE, assuming 
P = const, J, = const, and J, varying sinusoidally 
over 24 hours, say, Z, = A — Ccos(277/24)t.Discuss 
the effect of each term of the equation on the solution. 


Drug injection. Find and solve the model for drug 
injection into the bloodstream if, beginning at tf = 0,a 
constant amount A g/min is injected and the drug is 
simultaneously removed at a rate proportional to the 
amount of the drug present at time f. 


Epidemics. A model for the spread of contagious 
diseases is obtained by assuming that the rate of spread 
is proportional to the number of contacts between 
infected and noninfected persons, who are assumed to 
move freely among each other. Set up the model. Find 
the equilibrium solutions and indicate their stability or 
instability. Solve the ODE. Find the limit of the 
proportion of infected persons as t—> oo and explain 
what it means. 


Lake Erie. Lake Erie has a water volume of about 
450 km?and a flow rate (in and out) of about 175 km? 
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36. 


37. 


38. 


1.6 Orthogonal Trajectories. 
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per year. If at some instant the lake has pollution 
concentration p = 0.04%, how long, approximately, 
will it take to decrease it to p/2, assuming that the 
inflow is much cleaner, say, it has pollution 
concentration p/4, and the mixture is uniform (an 
assumption that is only imperfectly true)? First guess. 


Harvesting renewable resources. Fishing. Suppose 
that the population y(f) of a certain kind of fish is given 
by the logistic equation (11), and fish are caught at a 
rate Hy proportional to y. Solve this so-called Schaefer 
model. Find the equilibrium solutions y; and yo (> 0) 
when H < A. The expression Y = Hyg is called 
the equilibrium harvest or sustainable yield corre- 
sponding to H. Why? 


Harvesting. In Prob. 36 find and graph the solution 
satisfying y(0) = 2 when (for simplicity) A = B = 1 
and H = 0.2. What is the limit? What does it mean? 
What if there were no fishing? 


Intermittent harvesting. In Prob. 36 assume that you 
fish for 3 years, then fishing is banned for the next 
3 years. Thereafter you start again. And so on. This is 
called intermittent harvesting. Describe qualitatively 
how the population will develop if intermitting is 
continued periodically. Find and graph the solution for 
the first 9 years, assuming that A = B = 1, H = 0.2, 
and y(0) = 2. 


39. 


40. 


0 2 4 6 8 t 


Fig. 23. Fish population in Problem 38 
Extinction vs. unlimited growth. If in a population 
y(t) the death rate is proportional to the population, and 
the birth rate is proportional to the chance encounters 
of meeting mates for reproduction, what will the model 
be? Without solving, find out what will eventually 
happen to a small initial population. To a large one. 
Then solve the model. 


Air circulation. In a room containing 20,000 ft? of air, 
600 ft#of fresh air flows in per minute, and the mixture 
(made practically uniform by circulating fans) is 
exhausted at a rate of 600 cubic feet per minute (cfm). 
What is the amount of fresh air y(‘) at any time if 
y(0) = 0? After what time will 90% of the air be fresh? 


Optional 


An important type of problem in physics or geometry is to find a family of curves that 
intersects a given family of curves at right angles. The new curves are called orthogonal 
trajectories of the given curves (and conversely). Examples are curves of equal 
temperature (isotherms) and curves of heat flow, curves of equal altitude (contour lines) 
on a map and curves of steepest descent on that map, curves of equal potential 
(equipotential curves, curves of equal voltage—the ellipses in Fig. 24) and curves of 


electric force (the parabolas in Fig. 24). 


Here the angle of intersection between two curves is defined to be the angle between 
the tangents of the curves at the intersection point. Orthogonal is another word for 


perpendicular. 


In many cases orthogonal trajectories can be found using ODEs. In general, if we 
consider G(x, y, c) = 0 to be a given family of curves in the xy-plane, then each value of 
c gives a particular curve. Since c is one parameter, such a family is called a one- 


parameter family of curves. 


In detail, let us explain this method by a family of ellipses 


(1) 


(c > 0) 
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and illustrated in Fig. 24. We assume that this family of ellipses represents electric 
equipotential curves between the two black ellipses (equipotential surfaces between two 
elliptic cylinders in space, of which Fig. 24 shows a cross-section). We seek the 
orthogonal trajectories, the curves of electric force. Equation (1) is a one-parameter family 
with parameter c. Each value of c (> 0) corresponds to one of these ellipses. 


Step I. Find an ODE for which the given family is a general solution. Of course, this 
ODE must no longer contain the parameter c. Differentiating (1), we have x + 2yy’ = 0. 
Hence the ODE of the given curves is 


(2) y’ =f y) = 3 


Fig. 24. Electrostatic field between two ellipses (elliptic cylinders in space): 
Elliptic equipotential curves (equipotential surfaces) and orthogonal 
trajectories (parabolas) 


Step 2. Find an ODE for the orthogonal trajectories y = y(x). This ODE is 


(3) = nye 
an co) er 


with the same fas in (2). Why? Well, a given curve passing through a point (xo, yo) has 
slope f(Xo, yo) at that point, by (2). The trajectory through (xo, yo) has slope —1/f(x9, yo) 
by (3). The product of these slopes is —1, as we see. From calculus it is known that this 
is the condition for orthogonality (perpendicularity) of two straight lines (the tangents at 
(Xo, Yo)), hence of the curve and its orthogonal trajectory at (xo, yo). 


Step 3. Solve (3) by separating variables, integrating, and taking exponents: 


dy dx 2 3 
= = 2-, In|y| = 2Inx + c, yHeax", 
y x 


This is the family of orthogonal trajectories, the quadratic parabolas along which electrons 
or other charged particles (of very small mass) would move in the electric field between 
the black ellipses (elliptic cylinders). 
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PROBLEEM—SET 1-6 


1-3 


FAMILIES OF CURVES 


Represent the given family of curves in the form 
G(x, y; c) = Oand sketch some of the curves. 


1. All ellipses with foci —3 and 3 on the x-axis. 

2. All circles with centers on the cubic parabola y = x? 
and passing through the origin (0, 0). 

3. The catenaries obtained by translating the catenary 
y = cosh xin the direction of the straight line y = x. 

4-10| ORTHOGONAL TRAJECTORIES (OTs) 


Sketch or graph some of the given curves. Guess what their 
OTs may look like. Find these OTs. 


4. 
6. 
8. 
10. 


y=x* +e 5. y = cx 


x=c 7. y =c/x? 
y=Vxte 9 y= ce™ 


x7 + (y - ce)? = 
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APPLICATIONS, EXTENSIONS 


11. 


12. 


Electric field. Let the electric equipotential lines 
(curves of constant potential) between two concentric 
cylinders with the z-axis in space be given by 
u(x, y) = x? + y* = c (these are circular cylinders in 
the xyz-space). Using the method in the text, find their 
orthogonal trajectories (the curves of electric force). 


Electric field. The lines of electric force of two opposite 
charges of the same strength at (—1, 0) and (1, 0) are 
the circles through (—1, 0) and (1, 0). Show that these 
circles are given by x2 + yy - c =1+c2 Show 
that the equipotential lines (which are orthogonal 
trajectories of those circles) are the circles given by 
(x + c*)? + 9? = c*2 — | (dashed in Fig. 25). 


13. 


14. 


15. 


16. 


Fig. 25. Electric field in Problem 12 


Temperature field. Let the isotherms (curves of 
constant temperature) in a body in the upper half-plane 
y > 0 be given by 4x? + 9y =. Find the ortho- 
gonal trajectories (the curves along which heat will 
flow in regions filled with heat-conducting material and 
free of heat sources or heat sinks). 


Conic sections. Find the conditions under which 
the orthogonal trajectories of families of ellipses 
x?/a2 + y/b? = c are again conic sections. Illustrate 
your result graphically by sketches or by using your 
CAS. What happens if a— 0? If b—0? 


Cauchy—Riemann equations. Show that for a family 
u(x, y) = c = const the orthogonal trajectories v(x, y) = 
const can be obtained from the following 
Cauchy—Riemann equations (which are basic in 
complex analysis in Chap. 13) and use them to find the 
orthogonal trajectories of e” sin y = const. (Here, sub- 
scripts denote partial derivatives.) 


ce= 


Uy = Vy, Uy = Vy 


Congruent OTs. If y’ = f(x) with findependent of y, 
show that the curves of the corresponding family are 
congruent, and so are their OTs. 


|./ Existence and Uniqueness of Solutions 
for Initial Value Problems 


The initial value problem 


ly’| + ly] =0, 


yO = 1 


has no solution because y = 0 (that is, yx) = 0 for all x) is the only solution of the ODE. 


The initial value problem 


yO) = 1 
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has precisely one solution, namely, y = x2 + 1. The initial value problem 
xy =y-1, yO) = 1 
has infinitely many solutions, namely, y = | + cx, where c is an arbitrary constant because 


y(O) = | for all c. 
From these examples we see that an initial value problem 


(1) y =f@y), yo) = yo 


may have no solution, precisely one solution, or more than one solution. This fact leads 
to the following two fundamental questions. 


Problem of Existence 


Under what conditions does an initial value problem of the form (1) have at least 
one solution (hence one or several solutions)? 


Problem of Uniqueness 


Under what conditions does that problem have at most one solution (hence excluding 
the case that is has more than one solution)? 


Theorems that state such conditions are called existence theorems and uniqueness 
theorems, respectively. 

Of course, for our simple examples, we need no theorems because we can solve these 
examples by inspection; however, for complicated ODEs such theorems may be of 
considerable practical importance. Even when you are sure that your physical or other 
system behaves uniquely, occasionally your model may be oversimplified and may not 
give a faithful picture of reality. 


Existence Theorem 


Let the right side f(x, y) of the ODE in the initial value problem 
() y'=fly), — y@o) = Yo 
be continuous at all points (x, y) in some rectangle 

R: |x — xol < a, ly — yol < b (Fig. 26) 
and bounded in R; that is, there is a number K such that 
(2) Yan ak for all (x, y) in R. 
Then the initial value problem (1) has at least one solution y(x). This solution exists 


at least for all x in the subinterval lx —x9l <a of the interval Ix —xol <a; 
here, a is the smaller of the two numbers a and b/K. 
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Fig. 26. Rectangle R in the existence and uniqueness theorems 
(Example of Boundedness. The function f(x, y) = x2 + y” is bounded (with K = 2) in the 


square Ix] <1, ly| < 1. The function f(x,y) =tan(« + y) is not bounded for 
|x + y| < 7/2. Explain!) 


Uniqueness Theorem 


Let f and its partial derivative fy = df/dy be continuous for all (x, y) in the rectangle 
R (Fig. 26) and bounded, say, 


(3) (a) [ft y)| SK, (b) lf y»l| SM for all (x, y) in R. 
Then the initial value problem (1) has at most one solution y(x). Thus, by Theorem 1, 


the problem has precisely one solution. This solution exists at least for all x in that 
subinterval |x — Xxol <a. 


Understanding These Theorems 


These two theorems take care of almost all practical cases. Theorem | says that if f(x, y) 
is continuous in some region in the xy-plane containing the point (xo, yo), then the initial 
value problem (1) has at least one solution. 

Theorem 2 says that if, moreover, the partial derivative df/dy of f with respect to y 
exists and is continuous in that region, then (1) can have at most one solution; hence, by 
Theorem 1, it has precisely one solution. 

Read again what you have just read—these are entirely new ideas in our discussion. 

Proofs of these theorems are beyond the level of this book (see Ref. [A11] in App. 1); 
however, the following remarks and examples may help you to a good understanding of 
the theorems. 

Since y’ = f(x, y), the condition (2) implies that |y’| S K; that is, the slope of any 
solution curve y(x) in R is at least —K and at most K. Hence a solution curve that passes 
through the point (x9, yo) must lie in the colored region in Fig. 27 bounded by the lines 
1, and J/g whose slopes are —K and K, respectively. Depending on the form of R, two 
different cases may arise. In the first case, shown in Fig. 27a, we have b/K 2 a and 
therefore a = a in the existence theorem, which then asserts that the solution exists for all 
x between x9 — a and xq + a. In the second case, shown in Fig. 27b, we have b/K < a. 
Therefore, a = b/K < a, and all we can conclude from the theorems is that the solution 
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exists for all x between x9 — b/K and xq + b/K. For larger or smaller x’s the solution 
curve may leave the rectangle R, and since nothing is assumed about f outside R, nothing 
can be concluded about the solution for those larger or amaller x’s; that is, for such x’s 
the solution may or may not exist—we don’t know. 


y Jy 
¥otb 

Yo 

Yo-5 

Joao 
<— =a ><" a=a— I< a >< a > 
° ° 
Ke 7 Xo Ed 
(a) (b) 


Fig. 27. The condition (2) of the existence theorem. (a) First case. (b) Second case 


Let us illustrate our discussion with a simple example. We shall see that our choice of 
a rectangle R with a large base (a long x-interval) will lead to the case in Fig. 27b. 


Choice of a Rectangle 


Consider the initial value problem 


yl =1ty’, y(0) = 0 


and take the rectangle R; |x| < 5, |y| < 3. Then a = 5,b = 3, and 
If, y)| = [1 + y?| = K = 10, 


of 
=| = 2\|y| SM =6, 
dy 
b 
a= 7 = 03 <a. 


Indeed, the solution of the problem is y = tan x (see Sec. 1.3, Example 1). This solution is discontinuous at 
+ 77/2, and there is no continuous solution valid in the entire interval |x| < 5 from which we started. | 


The conditions in the two theorems are sufficient conditions rather than necessary ones, 
and can be lessened. In particular, by the mean value theorem of differential calculus we 
have 


of 
fy) — FY = O2-yT | 
Yiy=y 


where (x, y) and (x, ye) are assumed to be in R, and y is a suitable value between y; 
and ys. From this and (3b) it follows that 


(4) If(x, yo) — f yD] = Mlye — yil- 
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It can be shown that (3b) may be replaced by the weaker condition (4), which is known 
as a Lipschitz condition.? However, continuity of f(x, y) is not enough to guarantee the 
uniqueness of the solution. This may be illustrated by the following example. 


Nonuniqueness 
The initial value problem 
yi =Vlyl. (0) =0 


has the two solutions 


P P : { x7/4 if x=0 
y= an ye = 
: : —x7/4 if x <0 


although f(x, y) = V |y| is continuous for all y. The Lipschitz condition (4) is violated in any region that includes 
the line y = 0, because for y; = 0 and positive yg we have 


Vyo 1 
Vy2 


and this can be made as large as we please by choosing yg sufficiently small, whereas (4) requires that the 
quotient on the left side of (5) should not exceed a fixed constant M. ‘| 


If, ya) — fs yd 


lye — yr Jy2 


(Vy2 > 0) 


(5) 


PROBLEM SET 7 


1. 


Linear ODE. If p and r in y’ + p(x)y = r(x) are 
continuous for all x in an interval |x — x9| = a, show 
that f(x, y) in this ODE satisfies the conditions of our 
present theorems, so that a corresponding initial value 
problem has a unique solution. Do you actually need 
these theorems for this ODE? 

Existence? Does the initial value problem 
(x — 2)y’ = y, (2) = L have a solution? Does your 
result contradict our present theorems? 


. Vertical strip. If the assumptions of Theorems | and 


2 are satisfied not merely in a rectangle but in a vertical 
infinite strip |x — x9| < a, in what interval will the 
solution of (1) exist? 


Change of initial condition. What happens in Prob. 
2 if you replace y(2) = 1 with y(2) = k? 

Length of x-interval. In most cases the solution of an 
initial value problem (1) exists in an x-interval larger than 
that guaranteed by the present theorems. Show this fact 
for y’ = 2y?, y(1) = 1 by finding the best possible a 


(choosing b optimally) and comparing the result with the 
actual solution. 


. CAS PROJECT. Picard Iteration. (a) Show that by 


integrating the ODE in (1) and observing the initial 
condition you obtain 


x 


(6) yx) = yo + | ft, yO) dt. 
This form (6) of (1) suggests Picard’s Iteration Method’? 
which is defined by 


(7) YnQ) == 0 ale [Fema dt, ns 1, 2.0% 


Xo 


It gives approximations yy, yo, y3, . . .of the unknown 
solution y of (1). Indeed, you obtain y; by substituting 
y = yo on the right and integrating—this is the first 
step—then ye by substituting y = y; on the right and 
integrating—this is the second step—and so on. Write 


®RUDOLF LIPSCHITZ (1832-1903), German mathematician. Lipschitz and similar conditions are important 
in modern theories, for instance, in partial differential equations. 
lORMILE PICARD (1856-1941). French mathematician, also known for his important contributions to 


complex analysis (see Sec. 16.2 for his famous theorem). Picard used his method to prove Theorems | and 2 
as well as the convergence of the sequence (7) to the solution of (1). In precomputer times, the iteration was of 
little practical value because of the integrations. 
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a program of the iteration that gives a printout of the 
first approximations yo, y1,..., yy as well as their 
graphs on common axes. Try your program on two 
initial value problems of your own choice. 

(b) Apply the iteration to y’ = x + y, y(0) = 0. Also 
solve the problem exactly. 

(c) Apply the iteration to y’ = 2y, y(0) = 1. Also 
solve the problem exactly. 

(d) Find all solutions of y’ = 2Vy, y(1) = 0. Which 
of them does Picard’s iteration approximate? 

(e) Experiment with the conjecture that Picard’s 
iteration converges to the solution of the problem for 
any initial choice of y in the integrand in (7) (leaving 
Yo outside the integral as it is). Begin with a simple ODE 
and see what happens. When you are reasonably sure, 
take a slightly more complicated ODE and give it a try. 


1. Explain the basic concepts ordinary and _ partial 
differential equations (ODEs, PDEs), order, general 
and particular solutions, initial value problems (IVPs). 
Give examples. 

2. What is a linear ODE? Why is it easier to solve than 
a nonlinear ODE? 

3. Does every first-order ODE have a solution? A solution 
formula? Give examples. 

4. What is a direction field? A numeric method for first- 
order ODEs? 

5. What is an exact ODE? Is f(x) dx + g(y) dy = 0 
always exact? 

6. Explain the idea of an integrating factor. Give two 
examples. 

7. What other solution methods did we consider in this 
chapter? 

8. Can an ODE sometimes be solved by several methods? 
Give three examples. 

9. What does modeling mean? Can a CAS solve a model 
given by a first-order ODE? Can a CAS set up a model? 


10. Give problems from mechanics, heat conduction, and 
population dynamics that can be modeled by first-order 
ODEs. 


11-16; DIRECTION FIELD: NUMERIC SOLUTION 


Graph a direction field (by a CAS or by hand) and sketch 
some solution curves. Solve the ODE exactly and compare. 
In Prob. 16 use Euler’s method. 


11. y’ + 2y =0 
12. y)=1-y? 
13. y' = y — 4y? 
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7. Maximum a. What is the largest possible @ in 
Example | in the text? 


8. Lipschitz condition. Show that for a linear ODE 
y’ + p(x)y = r(x) with continuous p and r in 
|x — x9| Sa a Lipschitz condition holds. This is 
remarkable because it means that for a linear ODE the 
continuity of f(x, y) guarantees not only the existence 
but also the uniqueness of the solution of an initial 
value problem. (Of course, this also follows directly 
from (4) in Sec. 1.5.) 

9. Common points. Can two solution curves of the same 
ODE have a common point in a rectangle in which the 
assumptions of the present theorems are satisfied? 

10. Three possible cases. Find all initial conditions such 
that (x? — x)y’ = (2x — 1)yhas no solution, precisely 
one solution, and more than one solution. 
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14. xy’ =y + x 
15. y’ + y = 1.01 cos 10x 


16. Solve y’ =y— y?, y(0) = 0.2 by Euler’s method 
(10 steps, h = 0.1). Solve exactly and compute the error. 


17-21| GENERAL SOLUTION 


Find the general solution. Indicate which method in this 
chapter you are using. Show the details of your work. 

17. y' + 2.5y = 1.6x 

18. y’ — 0.4y = 29 sinx 

19. 25yy’ — 4x =0 

20. y' = ay + by? (a #0) 

21. (Bxe¥ + 2y) dx + (x7e4 + x) dy = 0 


22-26| INITIAL VALUE PROBLEM (IVP) 
Solve the IVP. Indicate the method used. Show the details 
of your work. 
22. y’ + 4xy =e 2", yO) = -4.3 
23. y = Vi -y*, yO) = 1/V2 


24. y’ +ay=y, yO) =3 
25. 3 sec ydx + 3 sec x dy =0, »(0)=0 
26. x sinh ydy = coshydx, y(3)=0 


27-30 | MODELING, APPLICATIONS 


27. Exponential growth. If the growth rate of a culture 
of bacteria is proportional to the number of bacteria 
present and after 1 day is 1.25 times the original 
number, within what interval of time will the number 
of bacteria (a) double, (b) triple? 
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28. Mixing problem. The tank in Fig. 28 contains 80 1b 29, Half-life. If in a reactor, uranium 73/U loses 10% of 


of salt dissolved in 500 gal of water. The inflow per its weight within one day, what is its half-life? How 
minute is 20 lb of salt dissolved in 20 gal of water. The long would it take for 99% of the original amount to 
outflow is 20 gal/min of the uniform mixture. Find the disappear? 


time when the salt content y(t) in the tank reaches 95% 30 


Eee iene . Newton’s law of cooling. A metal bar whose 
of its limiting value (as t— ©), 


temperature is 20°C is placed in boiling water. How 
long does it take to heat the bar to practically 100°C, 


say, to 99.9°C if the temperature of the bar after 1 min 
—- => of heating is 51.5°C? First guess, then calculate. 
—- 


SS 


Fig. 28. Tank in Problem 28 


SUMMARY—OF- CHAPTER ] 


First-Order ODEs 


This chapter concerns ordinary differential equations (ODEs) of first order and 
their applications. These are equations of the form 


(1) F(x, y,y')=0 — orinexplicitform —_y’ = f(x, y) 


involving the derivative y’ = dy/dx of an unknown function y, given functions of 
x, and, perhaps, y itself. If the independent variable x is time, we denote it by ¢. 

In Sec. 1.1 we explained the basic concepts and the process of modeling, that is, 
of expressing a physical or other problem in some mathematical form and solving 
it. Then we discussed the method of direction fields (Sec. 1.2), solution methods 
and models (Secs. 1.3—1.6), and, finally, ideas on existence and uniqueness of 
solutions (Sec. 1.7). 

A first-order ODE usually has a general solution, that is, a solution involving an 
arbitrary constant, which we denote by c. In applications we usually have to find a 
unique solution by determining a value of c from an initial condition y(xo) = yo. 
Together with the ODE this is called an initial value problem 


(2) y =f@y),  y«o =yo (xo, Yo given numbers) 


and its solution is a particular solution of the ODE. Geometrically, a general 

solution represents a family of curves, which can be graphed by using direction 

fields (Sec. 1.2). And each particular solution corresponds to one of these curves. 
A separable ODE is one that we can put into the form 


(3) g(y) dy = f(x) dx (Sec. 1.3) 


by algebraic manipulations (possibly combined with transformations, such as 
y/x = u) and solve by integrating on both sides. 


Summary of Chapter 1 
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An exact ODE is of the form 
(4) M(x, y) dx + N(x, y) dy = 0 (Sec. 1.4) 
where M dx + N dy is the differential 
du = uy dx + uy dy 


of a function u(x, y), so that from du = 0 we immediately get the implicit general 

solution u(x, y) = c. This method extends to nonexact ODEs that can be made exact 

by multiplying them by some function F(x, y,), called an integrating factor (Sec. 1.4). 
Linear ODEs 


(5) y’ + py = re) 


are very important. Their solutions are given by the integral formula (4), Sec. 1.5. 
Certain nonlinear ODEs can be transformed to linear form in terms of new variables. 
This holds for the Bernoulli equation 


y’ + p@y = gay" (Sec. 1.5). 


Applications and modeling are discussed throughout the chapter, in particular in 
Secs. 1.1, 1.3, 1.5 (population dynamics, etc.), and 1.6 (trajectories). 

Picard’s existence and uniqueness theorems are explained in Sec. 1.7 (and 
Picard’s iteration in Problem Set 1.7). 

Numeric methods for first-order ODEs can be studied in Secs. 21.1 and 21.2 
immediately after this chapter, as indicated in the chapter opening. 


CHAPTER 2 


Second-Order Linear ODEs 


Many important applications in mechanical and electrical engineering, as shown in Secs. 
2.4, 2.8, and 2.9, are modeled by linear ordinary differential equations (linear ODEs) of the 
second order. Their theory is representative of all linear ODEs as is seen when compared 
to linear ODEs of third and higher order, respectively. However, the solution formulas for 
second-order linear ODEs are simpler than those of higher order, so it is a natural progression 
to study ODEs of second order first in this chapter and then of higher order in Chap. 3. 

Although ordinary differential equations (ODEs) can be grouped into linear and nonlinear 
ODEs, nonlinear ODEs are difficult to solve in contrast to linear ODEs for which many 
beautiful standard methods exist. 

Chapter 2 includes the derivation of general and particular solutions, the latter in 
connection with initial value problems. 

For those interested in solution methods for Legendre’s, Bessel’s, and the hypergeometric 
equations consult Chap. 5 and for Sturm—Liouville problems Chap. 11. 


COMMENT. Numerics for second-order ODEs can be studied immediately after this 
chapter. See Sec. 21.3, which is independent of other sections in Chaps. 19-21. 


Prerequisite: Chap. 1, in particular, Sec. 1.5. 
Sections that may be omitted in a shorter course: 2.3, 2.9, 2.10. 
References and Answers to Problems: App. | Part A, and App. 2. 


2.1 Homogeneous Linear ODEs of Second Order 
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We have already considered first-order linear ODEs (Sec. 1.5) and shall now define and 
discuss linear ODEs of second order. These equations have important engineering 
applications, especially in connection with mechanical and electrical vibrations (Secs. 2.4, 
2.8, 2.9) as well as in wave motion, heat conduction, and other parts of physics, as we 
shall see in Chap. 12. 

A second-order ODE is called linear if it can be written 


(1) y" + py’ + q@y = rx) 


and nonlinear if it cannot be written in this form. 

The distinctive feature of this equation is that it is linear in y and its derivatives, whereas 
the functions p, qg, and r on the right may be any given functions of x. If the equation 
begins with, say, f(x)y”, then divide by f(x) to have the standard form (1) with y” as the 
first term. 
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The definitions of homogeneous and nonhomogenous second-order linear ODEs are 
very similar to those of first-order ODEs discussed in Sec. 1.5. Indeed, if r(x) = 0 (that 
is, r(x) = 0 for all x considered; read “r(x) is identically zero”), then (1) reduces to 


(2) y” + py’ + gay =0 


and is called homogeneous. If 7(x) # 0, then (1) is called nonhomogeneous. This is 
similar to Sec. 1.5. 
An example of a nonhomogeneous linear ODE is 


y” + 25y = e~*cos x, 


and a homogeneous linear ODE is 
”" , : : ”" 1 , 
xy +y +xy=0, written in standard form yy re 0. 


Finally, an example of a nonlinear ODE is 
yy + y'2 =0. 


The functions p and q in (1) and (2) are called the coefficients of the ODEs. 
Solutions are defined similarly as for first-order ODEs in Chap. 1. A function 


y = h(x) 


is called a solution of a (linear or nonlinear) second-order ODE on some open interval J 
if h is defined and twice differentiable throughout that interval and is such that the ODE 
becomes an identity if we replace the unknown y by h, the derivative y’ by h’, and the 
second derivative y” by h”. Examples are given below. 


Homogeneous Linear ODEs: Superposition Principle 


Sections 2.1—2.6 will be devoted to homogeneous linear ODEs (2) and the remaining 
sections of the chapter to nonhomogeneous linear ODEs. 

Linear ODEs have a rich solution structure. For the homogeneous equation the backbone 
of this structure is the superposition principle or linearity principle, which says that we 
can obtain further solutions from given ones by adding them or by multiplying them with 
any constants. Of course, this is a great advantage of homogeneous linear ODEs. Let us 
first discuss an example. 


Homogeneous Linear ODEs: Superposition of Solutions 
The functions y = cos x and y = sin x are solutions of the homogeneous linear ODE 
y” +y=0 
for all x. We verify this by differentiation and substitution. We obtain (cos x)” = —cos x; hence 


: ” 
y +y=(cosx) + cosx cosx + cosx = 0. 
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EXAMPLE 3 
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Similarly for y = sin x (verify!). We can go an important step further. We multiply cos x by any constant, for 
instance, 4.7, and sinx by, say, —2, and take the sum of the results, claiming that it is a solution. Indeed, 
differentiation and substitution gives 


(4.7 cos x — 2 sinx)” + (4.7 cos x — 2 sin x) 4.7 cosx + 2 sinx + 4.7 cosx — 2 sinx = 0. H 


In this example we have obtained from yj (= cos x) and ye (= sin x) a function of the form 
(3) y = c1y1 + Cove (C1, Cg arbitrary constants). 
This is called a linear combination of y; and yo. In terms of this concept we can now 


formulate the result suggested by our example, often called the superposition principle 
or linearity principle. 


Fundamental Theorem for the Homogeneous Linear ODE (2) 


For a homogeneous linear ODE (2), any linear combination of two solutions on an 
open interval I is again a solution of (2) on I. In particular, for such an equation, 
sums and constant multiples of solutions are again solutions. 


Let y, and yo be solutions of (2) on J. Then by substituting y = cyy, + ceye and 
its derivatives into (2), and using the familiar rule (cy y, + coy)’ = cyy4 + coy9, ete., 
we get 


y” + py’ + ay = (c1y1 + coya)” + plery1 + caye)’ + g(cry1 + caye) 
= cyt + coys + p(cryt + ceys) + g(c1y1 + Cayo) 


ci(y1 + py, + gyi) + coy + pya + ya) = 0, 


since in the last line, (---) = 0 because y, and yo are solutions, by assumption. This shows 
that y is a solution of (2) on J. i 


CAUTION! Don’t forget that this highly important theorem holds for homogeneous 
linear ODEs only but does not hold for nonhomogeneous linear or nonlinear ODEs, as 
the following two examples illustrate. 


A Nonhomogeneous Linear ODE 


Verify by substitution that the functions y = 1 + cos x and y = | + sin x are solutions of the nonhomogeneous 
linear ODE 


y" +y= 1 


but their sum is not a solution. Neither is, for instance, 2(1 + cos x) or 5(1 + sin x). ii 


A Nonlinear ODE 


Verify by substitution that the functions y = x? and y = | are solutions of the nonlinear ODE 
y"y — xy’ = 0, 


but their sum is not a solution. Neither is —x?, so you cannot even multiply by —1! ia] 
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Initial Value Problem. Basis. General Solution 


Recall from Chap. | that for a first-order ODE, an initial value problem consists of the 
ODE and one initial condition y(xo9) = yo. The initial condition is used to determine the 
arbitrary constant c in the general solution of the ODE. This results in a unique solution, 
as we need it in most applications. That solution is called a particular solution of the 
ODE. These ideas extend to second-order ODEs as follows. 

For a second-order homogeneous linear ODE (2) an initial value problem consists of 
(2) and two initial conditions 


(4) y(xo) = Ko, —y' (Xo) = Ki. 


These conditions prescribe given values Kg and K, of the solution and its first derivative 
(the slope of its curve) at the same given x = Xq in the open interval considered. 

The conditions (4) are used to determine the two arbitrary constants c, and cg in a 
general solution 


(5) y = c1y1 + Cayo 


of the ODE; here, y; and ye are suitable solutions of the ODE, with “suitable” to be 
explained after the next example. This results in a unique solution, passing through the 
point (xo, Ko) with Kz as the tangent direction (the slope) at that point. That solution is 
called a particular solution of the ODE (2). 


Initial Value Problem 
Solve the initial value problem 
y" +y=0, yO) =3.0,  y'(0) = -0.5. 


Solution. Step 1. General solution. The functions cos x and sin x are solutions of the ODE (by Example 1), 
and we take 


y = c ,cosx + cg sin x. 


This will turn out to be a general solution as defined below. 


Step 2. Particular solution. We need the derivative y! = —cy sin x + cy cos x. From this and the 
initial values we obtain, since cos 0 = 1 and sin 0 = 0, 


y(0) = cy = 3.0 and y'(0) = cg = —-0.5. 
This gives as the solution of our initial value problem the particular solution 


y = 3.0 cos x — 0.5 sin x. 


Fig. 29. Particular solution 
and initial tangent in Figure 29 shows that at x = 0 it has the value 3.0 and the slope —0.5, so that its tangent intersects 


Example 4 


the x-axis atx = 3.0/0.5 = 6.0 . (The scales on the axes differ!) B 
Observation. Our choice of y; and yg was general enough to satisfy both initial 
conditions. Now let us take instead two proportional solutions yy = cos x and yg = k cos x, 


so that y;/yo = I/k = const. Then we can write y = cyy, + Coy in the form 


y = cz, cosx + co(k cos x) = Ccos x where C = cy + Cok. 
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Hence we are no longer able to satisfy two initial conditions with only one arbitrary 
constant C. Consequently, in defining the concept of a general solution, we must exclude 
proportionality. And we see at the same time why the concept of a general solution is of 
importance in connection with initial value problems. 


General Solution, Basis, Particular Solution 


A general solution of an ODE (2) on an open interval J is a solution (5) in which 
yy and yg are solutions of (2) on / that are not proportional, and c, and cg are arbitrary 
constants. These yj, ye are called a basis (or a fundamental system) of solutions 
of (2) on I. 

A particular solution of (2) on / is obtained if we assign specific values to cy 
and Cs» in (5). 


For the definition of an interval see Sec. 1.1. Furthermore, as usual, y; and yg are called 
proportional on I if for all x on J, 


(6) (a) yy = kyo or (b) ye=)y 


where k and / are numbers, zero or not. (Note that (a) implies (b) if and only if k # 0). 

Actually, we can reformulate our definition of a basis by using a concept of general 
importance. Namely, two functions y; and yg are called linearly independent on an 
interval J where they are defined if 


(7) kyyy(x) + keyo(x) = 0 everywhere on J implies ky = Oandky = 0. 
And y, and yo are called linearly dependent on / if (7) also holds for some constants k 1, 


ky not both zero. Then, if ky # 0 orka #0, we can divide and see that y; and yo are 
proportional, 


In contrast, in the case of linear independence these functions are not proportional because 
then we cannot divide in (7). This gives the following 


Basis (Reformulated) 


A basis of solutions of (2) on an open interval / is a pair of linearly independent 
solutions of (2) on J. 


If the coefficients p and qg of (2) are continuous on some open interval J, then (2) has a 
general solution. It yields the unique solution of any initial value problem (2), (4). It 
includes all solutions of (2) on J; hence (2) has no singular solutions (solutions not 
obtainable from of a general solution; see also Problem Set 1.1). All this will be shown 
in Sec. 2.6. 
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EXAMPLE 5 


EXAMPLE 6 


EXAMPLE 7 


Basis, General Solution, Particular Solution 


cosx and sinx in Example 4 form a basis of solutions of the ODE y” + y = 0 for all x because their 
quotient is cot x # const (or tan x # const). Hence y = cy cos x + cg sin. x is a general solution. The solution 
y = 3.0 cos x — 0.5 sin x of the initial value problem is a particular solution. ia 


Basis, General Solution, Particular Solution 


Verify by substitution that y; = e” and yg = e~” are solutions of the ODE y” — y = 0. Then solve the initial 
value problem 
y 


y"-y=0, y0)=6,  y'O) = -2. 


Solution. (e*)" — e* =0 and (e~*)" — e~* = 0 show that e* and e~” are solutions. They are not 
proportional, e"/e~” = e?” # const. Hence e”, e~* form a basis for all x. We now write down the corresponding 
general solution and its derivative and equate their values at 0 to the given initial conditions, 


y = cye” + ce”, y’ = cye™ — coe, y(0) = cy + cp = 6, y'(0) = cy — co = -2. 


By addition and subtraction, cy = 2, cp = 4, so that the answer is y = 2e” + 4e~*. This is the particular solution 
satisfying the two initial conditions. B 


Find a Basis if One Solution Is Known. 
Reduction of Order 


It happens quite often that one solution can be found by inspection or in some other way. 
Then a second linearly independent solution can be obtained by solving a first-order ODE. 
This is called the method of reduction of order.’ We first show how this method works 
in an example and then in general. 


Reduction of Order if a Solution Is Known. Basis 
Find a basis of solutions of the ODE 
(x? _ xy" yer xy’ he y= 0. 


Solution. Inspection shows that y, = x is a solution because yj = 1 and y] = 0, so that the first term 
vanishes identically and the second and third terms cancel. The idea of the method is to substitute 


Uy # ” ” # 
y = uy, = ux, y =uxt u, y =uxt2u 


into the ODE. This gives 


(x? — x)(u"x + 2u') — x(u'x + u) + ux = 0. 


ux and —xu cancel and we are left with the following ODE, which we divide by x, order, and simplify, 


(x? x)(u"x + 2u') — x2u' = 0, (x? xu” + (x — 2)’ =0. 


This ODE is of first order inv = u’, namely, (x2 — xv! + (x — 2) = 0. Separation of variables and integration 
gives 


d x—2 1 2 -—1 
us _ dx ( Jar In |v| = In |x — 1] — 21In |x| = In be x I 
3) ae | x= 1 x % 


Credited to the great mathematician JOSEPH LOUIS LAGRANGE (1736-1813), who was born in Turin, 
of French extraction, got his first professorship when he was 19 (at the Military Academy of Turin), became 
director of the mathematical section of the Berlin Academy in 1766, and moved to Paris in 1787. His important 
major work was in the calculus of variations, celestial mechanics, general mechanics (Mécanique analytique, 
Paris, 1788), differential equations, approximation theory, algebra, and number theory. 
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We need no constant of integration because we want to obtain a particular solution; similarly in the next 
integration. Taking exponents and integrating again, we obtain 


e= 1 1 1 1 
v } z u [v dx = In |x| +-, hence yo = ux =x In |x| + 1. 
x Ry <x % 
Since y, = x and yg = x In |x| + 1 are linearly independent (their quotient is not constant), we have obtained 
a basis of solutions, valid for all positive x. fe 


In this example we applied reduction of order to a homogeneous linear ODE [see (2)] 


y” + pQdy’ + q@y = 0. 


Note that we now take the ODE in standard form, with y”, not f(x)y”—this is essential 
in applying our subsequent formulas. We assume a solution y, of (2), on an open interval 
I, to be known and want to find a basis. For this we need a second linearly independent 
solution yo of (2) on J. To get ya, we substitute 


WN 


y=y=uy, y =y2=u'y ty, y" = yg = uy, + Qu'y, + wy! 
into (2). This gives 
(8) uy, + 2u'yy + uyt + plu’y + uyt) + quyi = 0. 
Collecting terms in u", u’, and u, we have 
uy, + u'(2yi + pyi) + uO + pyi + gyi) = 0. 
Now comes the main point. Since y, is a solution of (2), the expression in the last 


parentheses is zero. Hence wu is gone, and we are left with an ODE in uv’ and u”. We divide 
this remaining ODE by yj and set u' = U, u” =U 4 


2y, + 2y1 
u" tu! 22h = 0, thus u + (2+ p)u=0. 


This is the desired first-order ODE, the reduced ODE. Separation of variables and 
integration gives 


2y1 
W(t pa and inlul=-2In il ~ [oa 


(9) 2 


Here U = u’, so that u = JU dx. Hence the desired second solution is 


ye= y= yi [Uae 


The quotient y2/yy = u = fU dx cannot be constant (since U > 0), so that yy and yg form 
a basis of solutions. 
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REDUCTION OF ORDER is important because it 
gives a simpler ODE. A general second-order ODE 
F(x, y, y’, y") = 0, linear or not, can be reduced to first 
order if y does not occur explicitly (Prob. 1) or if x does not 
occur explicitly (Prob. 2) or if the ODE is homogeneous 


linear and we know a solution (see the text). 
1. Reduction. Show that F(x,y',y")=0 can be 


reduced to first order in z = y’ (from which y follows 
by integration). Give two examples of your own. 


2. Reduction. Show that F(y,y',y”)=0 can be 
reduced to a first-order ODE with y as the independent 
variable and y” = (dz/dy)z, where z = y’; derive this 
by the chain rule. Give two examples. 


REDUCTION OF ORDER 


Reduce to first order and solve, showing each step in detail. 


3,.y +y = 

4, 2xy” = 3y’ 

5. yy” = 3y” 

6. xy” + 2y' + xy =0, yy = (cos x/x 
7 y" +ysiny =0 

8 y"=1+y? 

9. x7y" Sxy' + 9y =0, yy x3 


10. y’ +(1 + ly’? =0 


APPLICATIONS OF REDUCIBLE ODEs 


11. Curve. Find the curve through the origin in the 
xy-plane which satisfies y” = 2y’ and whose tangent 
at the origin has slope 1. 

12. Hanging cable. It can be shown that the curve y(x) 
of an inextensible flexible homogeneous cable hanging 
between two fixed points is obtained by solving 


PROBLEM SET 271 


y” =kV1 + y’”, where the constant k depends on the 
weight. This curve is called catenary (from Latin 
catena = the chain). Find and graph y(x), assuming that 
k = 1 and those fixed points are (—1, 0) and (1, 0) in 
a vertical xy-plane. 

13. Motion. If, in the motion of a small body on a 
straight line, the sum of velocity and acceleration equals 
a positive constant, how will the distance y(t) depend 
on the initial velocity and position? 

14. Motion. In a straight-line motion, let the velocity be 
the reciprocal of the acceleration. Find the distance y(f) 
for arbitrary initial position and velocity. 


15-19 


GENERAL SOLUTION. INITIAL VALUE 
PROBLEM (IVP) 


(More in the next set.) (a) Verify that the given functions 
are linearly independent and form a basis of solutions of 
the given ODE. (b) Solve the IVP. Graph or sketch the 
solution. 
15. 4y"” + 25y =0, y(0) = 3.0, y’(0) = —2.5, 

cos 2.5x, sin 2.5x 


16. y” + 0.6y’ + 0.09y = 0, yO) = 2.2, y’(0) = 0.14, 


é ae xe7 03% 


17. 4x2y" — 3y =0, yl) =-3, y'(1) =0, 


3/2 1/2 
18. x2y" — xy’ +y=0, yd) =43, y'(l) =0.5, 
x,xInx 


19, y” + 2y’+ 2y=0, y(0)=0, y'(0) = 15, 
e“cosx,e” sinx 

20. CAS PROJECT. Linear Independence. Write a 
program for testing linear independence and depen- 
dence. Try it out on some of the problems in this and 
the next problem set and on examples of your own. 


2.2 Homogeneous Linear ODEs 
with Constant Coefficients 


We shall now consider second-order homogeneous linear ODEs whose coefficients a and 


b are constant, 


(1) 


y”" + ay’ + by =0. 


These equations have important applications in mechanical and electrical vibrations, as 
we shall see in Secs. 2.4, 2.8, and 2.9. 
To solve (1), we recall from Sec. 1.5 that the solution of the first-order linear ODE with 


a constant coefficient k 


y +ky=0 
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is an exponential function y = ce". This gives us the idea to try as a solution of (1) the 
function 


(2) poe", 
Substituting (2) and its derivatives 
y’ = Aer” and y 
into our equation (1), we obtain 
(A2 + ad + bye*” = 0. 
Hence if A is a solution of the important characteristic equation (or auxiliary equation) 


(3) M+ ak t+ b=0 


then the exponential function (2) is a solution of the ODE (1). Now from algebra we recall 
that the roots of this quadratic equation (3) are 


(4) Ay =3Ca + Va? — 4b), dg = 3(-a —- Va? — 40). 
(3) and (4) will be basic because our derivation shows that the functions 


Aox 


(5) y= ear and yo =e 


are solutions of (1). Verify this by substituting (5) into (1). 
From algebra we further know that the quadratic equation (3) may have three kinds of 
roots, depending on the sign of the discriminant a” — 4b, namely, 


(Case I) Two real roots if a” — AD () 
(Case I) A real double root if a” — 4b = 0, 
(Case IM) Complex conjugate roots if a2 — 4b < 0. 


Case |. Two Distinct Real-Roots A, and A, 


In this case, a basis of solutions of (1) on any interval is 


Ayu 


yy=e and Yo = 2” 


because y; and yg are defined (and real) for all x and their quotient is not constant. The 
corresponding general solution is 


(6) y = cye™™ + cge??”. 
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EXAMPLE=1 


EXAMPLE 2 


General Solution in the Case of Distinct Real Roots 


We can now solve y” — y= 0 in Example 6 of Sec. 2.1 systematically. The characteristic equation is 
d? — 1 = 0. Its roots are A, = 1 and Ag = —1. Hence a basis of solutions is e” and e~* and gives the same 
general solution as before, 


y = cye” + coe”. H 


Initial Value Problem in the Case of Distinct Real Roots 
Solve the initial value problem 
yi ty —2y=0, yO)=4, yO) = -5. 
Solution. Step 1. General solution. The characteristic equation is 
M+A-2=0. 

Its roots are 

Ay=3(-1+ V9)=1 and = Ag =3(-1 - V9) = -2 
so that we obtain the general solution 

y = ce” + ce”. 


Step 2. Particular solution. Since y'(x) = cye” — 2cge~2", we obtain from the general solution and the initial 
conditions 


y(0) = cy + cg = 4, 
y'(0) = cy — 2cg = —5. 


Hence c; = 1 and cg = 3. This gives the answer y = e* + 3e 2”, Figure 30 shows that the curve begins at 
y = 4 with a negative slope (—5, but note that the axes have different scales!), in agreement with the initial 
conditions. B 


Oo NM FF DD O8 
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Fig. 30. Solution in Example 2 


Case II. Real Double Root A = —a/2 


If the discriminant a? — 4b is zero, we see directly from (4) that we get only one root, 
A = Ay = Ag = —a/2, hence only one solution, 


y= eo Ux 


To obtain a second independent solution yo (needed for a basis), we use the method of 
reduction of order discussed in the last section, setting yo = uy ,. Substituting this and its 


derivatives y5 = u'y, + uy} and y$ into (1), we first have 


(u"y, + Qu’ yy + uy4) + a(u' yy + uy) + buy, = 0. 
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EXAMPLE 4 
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Collecting terms in u", u’, and u, as in the last section, we obtain 
u"yy + u'(2y, + ayy) + u(t + ayy + byy) = 0. 


The expression in the last parentheses is zero, since y, is a solution of (1). The expression 
in the first parentheses is zero, too, since 


dy, = ge" = —ay}. 


We are thus left with w”y, = 0. Hence uw” = 0. By two integrations, u = cyx + cy. To 
get a second independent solution yg = uy,, we can simply choose cy = 1, cg = O and 
take u = x. Then yg = xy ,. Since these solutions are not proportional, they form a basis. 
Hence in the case of a double root of (3) a basis of solutions of (1) on any interval is 


ge we. xe 2. 


The corresponding general solution is 
(7) y = (1 + exe”. 


WARNING! If Aisa simple root of (4), then (cy + cox)e*” with co # 0 is not a solution 
of (1). 


General Solution in the Case of a Double Root 


The characteristic equation of the ODE y” + 6y’ + 9y = Ois dM? + 6A +9 = (A + 3)? = O. It has the double 
root A = —3. Hence a basis is e~®” and xe~>”. The corresponding general solution is y = (cy + cxe > Hi 


Initial Value Problem in the Case of a Double Root 


Solve the initial value problem 


y" +y' +0.25y=0, (0) = 3.0, y'(0) = 3.5. 


Solution. The characteristic equation is A + A + 0.25 = (A + 0.5) = 0. It has the double root A = —0.5. 
This gives the general solution 


y = (cy + coxye 9. 


We need its derivative 


y’ = cge 9 — 0.5(cy + cox)e 9, 
From this and the initial conditions we obtain 


yO) = cy = 3.0, y'(0) = cg — 0.5e, = 3.5; hence Cg = —2. 


The particular solution of the initial value problem is y = (3 — 2x)e 9”, See Fig. 31. | 
a 
3 
2 
1 
e) | | | 
10 12 14 % 


Fig. 31. Solution in Example 4 
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EXAMPLE 5 


EXAMPLE 6 


1 ’ 1 ; 
Case Ill. Complex Roots —3a + iw and —5a — iw 


This case occurs if the discriminant a — 4b of the characteristic equation (3) is negative. 
In this case, the roots of (3) are the complex A = — sa + iw that give the complex solutions 
of the ODE (1). However, we will show that we can obtain a basis of real solutions 


(8) yy = e 7? cos wx, yo = e Ml? sin wx (w > 0) 


where w” = b — 4a”. It can be verified by substitution that these are solutions in the 
present case. We shall derive them systematically after the two examples by using the 
complex exponential function. They form a basis on any interval since their quotient 
cot wx is not constant. Hence a real general solution in Case III is 


(9) y = e 2 (4 cos wx + B sin wx) (A, B arbitrary). 


Complex Roots. Initial Value Problem 


Solve the initial value problem 
y" + 0.4y' + 9.04y=0, y0)=0,  y'(0) =3. 


Solution. Step 1. General solution. The characteristic equation is d? + 0.4A + 9.04 = 0. It has the roots 
—0.2 + 3i. Hence w = 3, and a general solution (9) is 


y = e "(4 cos 3x + B sin 3x). 


Step 2. Particular solution. The first initial condition gives y(0) = A = 0. The remaining expression is 
y= Be~°?* sin 3x. We need the derivative (chain rule!) 


y’ = B(-0.2e7°* sin 3x + 3e~°-2” cos 3x). 


From this and the second initial condition we obtain y’(0) = 3B = 3. Hence B = 1. Our solution is 


y =e 9 sin 3x, 


Figure 32 shows y and the curves of e~°?* and —e~°?* (dashed), between which the curve of y oscillates. 
Such “damped vibrations” (with x = ¢ being time) have important mechanical and electrical applications, as we 
shall soon see (in Sec. 2.4). a 


Fig. 32. Solution in Example 5 


Complex Roots 


A general solution of the ODE 


y" + wy =0 (w constant, not zero) 


y =Acoswx + Bsin wx. 


With w = 1 this confirms Example 4 in Sec. 2.1. a 
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Summary of Cases I-III 


Case Roots of (2) Basis of (1) General Solution of (1) 
I sols ad edt, prot y= cyes® + ce 
Il aa eet eo at/2 xe 08/2 y = (cy + cox) eo o/2 
Complex conjugate _an/2 
08 dy = —da + io, ée COS OX y = e */2(4 cos wx + B sin wx) 
do = —3a — iw e 2 sin wx 


It is very interesting that in applications to mechanical systems or electrical circuits, 
these three cases correspond to three different forms of motion or flows of current, 
respectively. We shall discuss this basic relation between theory and practice in detail in 
Sec. 2.4 (and again in Sec. 2.8). 


Derivation in Case Ill. Complex Exponential Function 


If verification of the solutions in (8) satisfies you, skip the systematic derivation of these 
real solutions from the complex solutions by means of the complex exponential function 
e* of a complex variable z = r + it. We write r + it, not x + iy because x and y occur 
in the ODE. The definition of e* in terms of the real functions e”, cos ¢, and sin ¢ is 


(10) e = e TH = ee" = o(cost + isind. 


This is motivated as follows. For real z = r, hence t = 0, cos 0 = 1, sin0 = 0, we get 
the real exponential function e”. It can be shown that e*!**? = e*e*?, just as in real. (Proof 
in Sec. 13.5.) Finally, if we use the Maclaurin series of e* with z = it as well as 
i= 1, i? = 1, i* = 1, etc., and reorder the terms as shown (this is permissible, as 


can be proved), we obtain the series 


Pe) 3 «4 it) ® 
i OO, wo 


e*=1+it+ 
2! 3! 4! 5! 
2 4 3 5 
ft t ‘4 
1 + + ti(r- O45 -4 ) 
2! 4! 3! ~=5! 


= cost + isint. 


(Look up these real series in your calculus book if necessary.) We see that we have obtained 
the formula 


(11) e” = cost + isint, 


called the Euler formula. Multiplication by e” gives (10). 
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For later use we note that e~* = cos (—f) + isin (—f) = cost — isint, so that by 


addition and subtraction of this and (11), 
: - 1.3 a 
(12) cost =s(e +e"), sint = ae —e%), 
i 


After these comments on the definition (10), let us now turn to Case III. 
In Case III the radicand a” — 4b in (4) is negative. Hence 4b — a” is positive and, 
using V—1 = i, we obtain in (4) 


A/a" — 4b = 3\V/—(4b — a®) = V—(b — 4a?) = iVb — ba? = iw 


with w defined as in (8). Hence in (4), 


A= sa + iw and, similarly, dg = sa — io. 
Using (10) with r = sax and t = wx, we thus obtain 
eh? = eg GlDat tox — o~GIDX cog wx + i sin wx) 


Age — g(Cal2a— tox — ——(Gl2Xcog x — i sin wx). 


e 
We now add these two lines and multiply the result by 5. This gives y, as in (8). Then 
we subtract the second line from the first and multiply the result by 1/(27). This gives yo 
as in (8). These results obtained by addition and multiplication by constants are again 
solutions, as follows from the superposition principle in Sec. 2.1. This concludes the 
derivation of these real solutions in Case II. 


PROBLEM SET 2-2 


1-15| GENERAL SOLUTION 14, y"” + 2k2y' + k4y =0 
Find a general solution. Check your answer by substitution. 15. y” + 0.54y’ + (0.0729 + m)y = 0 


ODEs of this kind have important applications to be 16-20] FIND AN ODE 


discussed in Secs. 2.4, 2.7, and 2.9. 
1, 4y" — 25y = 0 y" + ay’ + by = 0 for the given basis. 


2. y” + 36y = 0 16. e268, 43% 17. e| ise ai 
3. y"” + 6y’ + 8.96y = 0 18. cos 27x, sin 27x 19, eA te QAO 
4. y" + dy’ + (72 + Ay =0 20. e731” cos 2.1.x, e731” sin 2.1.x 
”" , 2. 

5. y + 2my + my =0 21-30} INITIAL VALUES PROBLEMS 
6. 10y” — 32y’ + 25.6y = 0 

eM y “oy Solve the IVP. Check that your answer satisfies the ODE as 
Ty" + ASy' =0 well as the initial conditions. Show the details of your work. 
8. y” + y' + 3.25y = 0 21. y” + 25y =0, y(0) = 4.6, yO) = -1.2 

9. y” + 1.8y' — 2.08y = 0 22. The ODE in Prob. 4, y(@) = 1, y'(@) = -2 
10. 100y” + 240y’ + (19677? + 144)y = 0 23. y" + y' — 6y =0, yO) =10, y’(0)=0 
11. 4y” — 4y’ — 3y =0 24. 4y"”—4y’ —3y=0, y(—-2) =e, y'(-2) = —e/2 
12. y” + 9y’ + 20y =0 25. y" —y=0, yO) =2, y'(0) = -2 


13. 9y" — 30y’ + 25y =0 26. y" —k*y=0(k #0), yO)=1, yO) =1 
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27. The ODE in Prob. 5, 
y(0) = 4.5, y'(0) = —4.5a7 — 1 = 13.137 
28. 8y"” — 2y’ -—y=0, yO) = 0.2, y'(0) = —0.325 
29. The ODE in Prob. 15, y(0)=0, y'(0)=1 
30. Sy” — 30y’ + 25y = 0, y(0) = 3.3, y’(0) = 10.0 


31-36 | LINEAR INDEPENDENCE is of basic impor- 
tance, in this chapter, in connection with general solutions, 


as explained in the text. Are the following functions linearly 

independent on the given interval? Show the details of your 

work. 

31. e!, xe, any interval 

32... ee, x0 

33. x7,x7Inx, x>1 

34. In x, In (x), x>1 

35. sin 2x,cosxsinx, x <0 

36. e~” cos 3X, 0, -lsSx=1 

37. Instability. Solve y” — y = 0 for the initial conditions 
y(0) = 1, y'(0) = —1. Then change the initial conditions 
to y(O) = 1.001, y’(0) = —0.999 and explain why this 
small change of 0.001 at t = O causes a large change later, 


2.3 Differential Operators. 


38. 


e.g., 22 at t = 10. This is instability: a small initial 
difference in setting a quantity (a current, for in- 
stance) becomes larger and larger with time ¢. This is 
undesirable. 


TEAM PROJECT. General Properties of Solutions 


(a) Coefficient formulas. Show how a and b in (1) 
can be expressed in terms of A; and Ag. Explain how 
these formulas can be used in constructing equations 
for given bases. 

(b) Root zero. Solve y” + 4y’ = 0 (i) by the present 
method, and (ii) by reduction to first order. Can you 
explain why the result must be the same in both 
cases? Can you do the same for a general ODE 
y" + ay’ =0? 

(c) Double root. Verify directly that xe*” with A = 
—a/2 is a solution of (1) in the case of a double root. 
Verify and explain why y = e 2” is a solution of 


” 2m - 


y" —y' — 6y = 0 but xe is not. 

(d) Limits. Double roots should be limiting cases of 
distinct roots Az, Ag as, say, Ag — Ay. Experiment with 
this idea. (Remember |’ H6pital’s rule from calculus.) 
Can you arrive at xe“? Give it a try. 


Optional 


This short section can be omitted without interrupting the flow of ideas. It will not be 
used subsequently, except for the notations Dy, Dy, etc. to stand for y’, y”, ete. 
Operational calculus means the technique and application of operators. Here, an 
operator is a transformation that transforms a function into another function. Hence 
differential calculus involves an operator, the differential operator D, which 
transforms a (differentiable) function into its derivative. In operator notation we write 


D =< and 


() 


ia 


a dx’ 


Similarly, for the higher derivatives we write D?y = D(Dy) = y",and so on. For example, 


D sin = cos, D? sin = —sin, etc. 


For a homogeneous linear ODE y” + ay’ + by = 0 with constant coefficients we can 
now introduce the second-order differential operator 


L = P(D) = D? + aD + DI, 


where / is the identity operator defined by Jy = y. Then we can write that ODE as 


(2) 


Ly = P(D)y = (D2 + aD + bly = 0. 
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EXAMPLE=-1 


P suggests “polynomial.” L is a linear operator. By definition this means that if Ly and 
Lw exist (this is the case if y and w are twice differentiable), then L(cy + kw) exists for 
any constants c and k, and 


L(cy + kw) = cLy + kLw. 


Let us show that from (2) we reach agreement with the results in Sec. 2.2. Since 
(De*)(x) = Ae*” and (D7e*)(x) = A7e*”, we obtain 


e Le*(x) = P(D)e*(x) = (D? + aD + bDe*(x) 
= (A? + ad + be = PiAje™ = 0. 


This confirms our result of Sec. 2.2 that e*” is a solution of the ODE (2) if and only if A 
is a solution of the characteristic equation P(A) = 0. 

P(A) is a polynomial in the usual sense of algebra. If we replace A by the operator D, 
we obtain the “operator polynomial” P(D). The point of this operational calculus is that 
P(D) can be treated just like an algebraic quantity. In particular, we can factor it. 


Factorization, Solution of an ODE 


Factor P(D) = D? — 3D — 40/ and solve P(D)y = 0. 


Solution. D — 3D — 40] = (D — 8I)(D + 51) because I? = I. Now (D — 8/)y = y’ — 8y = 0 has the 
solution yy = ee. Similarly, the solution of (D + S5/I)y = 0 is yo = e~°”. This is a basis of P(D)y = 0 on any 
interval. From the factorization we obtain the ODE, as expected, 


(D — 8I\(D + SI)y = (D — 81)Q" + Sy) = DQ" + Sy) — 80" + Sy) 


y” + 5y’ — By’ — 40y = y” — 3’ — 40y = 0. 


Verify that this agrees with the result of our method in Sec. 2.2. This is not unexpected because we factored 
P(D) in the same way as the characteristic polynomial P(A) = A2 — 3A — 40. @ 


It was essential that L in (2) had constant coefficients. Extension of operator methods to 
variable-coefficient ODEs is more difficult and will not be considered here. 

If operational methods were limited to the simple situations illustrated in this section, 
it would perhaps not be worth mentioning. Actually, the power of the operator approach 
appears in more complicated engineering problems, as we shall see in Chap. 6. 


PROBLEM SET 2-3 


1-5| APPLICATION OF DIFFERENTIAL 6-12} GENERAL SOLUTION 
OPERATORS Factor as in the text and solve. 

Apply the given operator to the given functions. Show all 6. (D? + 4.00D + 3.361)y = 0 
steps in detail. 7. (4D? — Dy = 0 

1. D? + 2D; cosh 2x, e~” + e?”, cos x 8. (D? + 3Dy =0 

2. D— 31; 3x7 + 3x, 3e*”, cos 4x — sin 4x 9. (D? — 4.20D + 4.41Dy = 0 
3. = oy: eee 10. (D2 + 4.80D + 5.76ly = 0 
4. (D + 61)"; 6x + sin 6x, xe & 11. (D? — 4.00D + 3.84Dy = 0 
5. (D — 21D + 31); e2”, xe?®, e7 3” 12. (D? + 3.0D + 2.5I)y = 0 
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13. Linear operator. Illustrate the linearity of L in (2) by 
taking c= 4,k = —6,y = e?", and w = cos 2k. 
Prove that L is linear. 

14. Double root. If D? + aD + bI has distinct roots 
mw and A, show that a particular solution is 
y = (e* — e*”)/(u — A). Obtain from this a solution 
xe*” by letting w — A and applying I’Hopital’s rule. 


15. 


Definition of linearity. Show that the definition of 
linearity in the text is equivalent to the following. If 
L[y] and L[w] exist, then L[y + w] exists and L[cy] 
and L[kw] exist for all constants c and k, and 
LLy + w] = L[Ly] + L[w] as well as Licy] = cL[y] 
and L[kw] = kL[w]. 


2.4 Modeling of Free Oscillations 
of a Mass—Spring System 


Linear ODEs with constant coefficients have important applications in mechanics, as we 
show in this section as well as in Sec. 2.8, and in electrical circuits as we show in Sec. 2.9. 
In this section we model and solve a basic mechanical system consisting of a mass on an 
elastic spring (a so-called “mass-—spring system,” Fig. 33), which moves up and down. 


Setting Up the Model 


We take an ordinary coil spring that resists extension as well as compression. We suspend 
it vertically from a fixed support and attach a body at its lower end, for instance, an iron 
ball, as shown in Fig. 33. We let y = 0 denote the position of the ball when the system 
is at rest (Fig. 33b). Furthermore, we choose the downward direction as positive, thus 
regarding downward forces as positive and upward forces as negative. 


spring 


(y = 0) . 
} 
Systemat = —+---- 


rest 


System in 
motion 


(a) (b) (c) 


Fig. 33. Mechanical mass—spring system 


We now let the ball move, as follows. We pull it down by an amount y > 0 (Fig. 33c). 
This causes a spring force 


(1) Fy = —ky (Hooke’s law”) 


proportional to the stretch y, with k (>0) called the spring constant. The minus sign 
indicates that F'; points upward, against the displacement. It is a restoring force: It wants 
to restore the system, that is, to pull it back to y = 0. Stiff springs have large k. 


?ROBERT HOOKE (1635-1703), English physicist, a forerunner of Newton with respect to the law of 
gravitation. 
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Note that an additional force —Fo is present in the spring, caused by stretching it in 
fastening the ball, but Fg has no effect on the motion because it is in equilibrium with 
the weight W of the ball, —Fy = W = mg, where g = 980 cm/sec” = 9.8 m/sec” = 
32.17 ft/ sec” is the constant of gravity at the Earth’s surface (not to be confused with 
the universal gravitational constant G = gR?/M = 6.67 « 107"! nt m?/kg”, which we 
shall not need; here R = 6.37 - 10° m and M = 5.98 - 1074 kg are the Earth’s radius and 
mass, respectively). 

The motion of our mass—spring system is determined by Newton’s second law 


(2) Mass X Acceleration = my” = Force 


where y” = d 2\,/ dt” and “Force” is the resultant of all the forces acting on the ball. (For 
systems of units, see the inside of the front cover.) 


ODE of the Undamped System 


Every system has damping. Otherwise it would keep moving forever. But if the damping 
is small and the motion of the system is considered over a relatively short time, we 


may disregard damping. Then Newton’s law with F = —F, gives the model 
my” = —F, = —ky; thus 
(3) my” + ky =0. 


This is a homogeneous linear ODE with constant coefficients. A general solution is 
obtained as in Sec. 2.2, namely (see Example 6 in Sec. 2.2) 


3|> 


(4) y(t) = Acos wot + B sin wot 9 = 


This motion is called a harmonic oscillation (Fig. 34). Its frequency is f = wo/27 Hertz? 
(= cycles/sec) because cos and sin in (4) have the period 277/wg. The frequency fis called 
the natural frequency of the system. (We write wo to reserve w for Sec. 2.8.) 


» 


@ Positive 
@ Zero Initial velocity 
@ Negative 
Fig. 34. Typical harmonic oscillations (4) and (4*) with the same y(0) = A and 
different initial velocities y’(0) = woB, positive (1), zero (2), negative G) 


3HEINRICH HERTZ (1857-1894), German physicist, who discovered electromagnetic waves, as the basis 
of wireless communication developed by GUGLIELMO MARCONI (1874-1937), Italian physicist (Nobel prize 
in 1909). 
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EXAMPLE 1 


Spring 


Fig. 36. 
Damped system 


CHAP. 2. Second-Order Linear ODEs 


An alternative representation of (4), which shows the physical characteristics of amplitude 
and phase shift of (4), is 


(4*) y(t) = C cos (wot — 8) 


with C = VA? + B? and phase angle 5, where tan 6 = B/A. This follows from the 
addition formula (6) in App. 3.1. 


Harmonic Oscillation of an Undamped Mass—Spring System 


If a mass—spring system with an iron ball of weight W = 98 nt (about 22 Ib) can be regarded as undamped, and 
the spring is such that the ball stretches it 1.09 m (about 43 in.), how many cycles per minute will the system 
execute? What will its motion be if we pull the ball down from rest by 16 cm (about 6 in.) and let it start with 
zero initial velocity? 


Solution. Hooke’s law (1) with W as the force and 1.09 meter as the stretch gives W = 1.09k; thus 
k = W/1.09 = 98/1.09 = 90 [kg/sec”] = 90 [nt/meter]. The mass is m = W/g = 98/9.8 = 10 [kg]. This 
gives the frequency w9/(277) = Vk/m/(277) = 3/(277) = 0.48 [Hz] = 29 [cycles/min]. 

From (4) and the initial conditions, y(0) = A = 0.16 [meter] and y’(0) = woB = 0. Hence the motion is 


y(t) = 0.16 cos 3¢ [meter] or 0.52 cos 3t [ft] (Fig. 35). 


If you have a chance of experimenting with a mass—spring system, don’t miss it. You will be surprised about 
the good agreement between theory and experiment, usually within a fraction of one percent if you measure 


carefully. ai] 

y 

O.2-F 

0.1 \ 
0 l l l 

t 
ac 7) 2 4 6 8 0 
0.2 


Fig. 35. Harmonic oscillation in Example 1 


ODE of the Damped System 


To our model my” = —ky we now add a damping force 
Fy = -cy’, 
obtaining my” = —ky — cy’; thus the ODE of the damped mass-spring system is 


(5) my” + cy’ + ky =0. (Fig. 36) 


Physically this can be done by connecting the ball to a dashpot; see Fig. 36. We assume 
this damping force to be proportional to the velocity y’ = dy/dt. This is generally a good 
approximation for small velocities. 
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The constant c is called the damping constant. Let us show that c is positive. Indeed, 
the damping force Fy = —cy’ acts against the motion; hence for a downward motion we 
have y’ > 0 which for positive c makes F negative (an upward force), as it should be. 
Similarly, for an upward motion we have y’ < 0 which, for c > 0 makes F2 positive (a 
downward force). 


The ODE (5) is homogeneous linear and has constant coefficients. Hence we can solve 
it by the method in Sec. 2.2. The characteristic equation is (divide (5) by m) 


k 


+2 y+ 2 =0. 
m m 
By the usual formula for the roots of a quadratic equation we obtain, as in Sec. 2.2, 


1 
(6) Ay=—-a+B, Ag = —a-— B, where a=— and B= —Vc?2 — 4mk. 
2m 2m 


It is now interesting that depending on the amount of damping present—whether a lot of 
damping, a medium amount of damping or little damping—three types of motions occur, 


respectively: 
CaseI.  c? > 4mk. Distinct real roots dy, Ag. (Overdamping) 
Case II. c? = 4mk. A real double root. (Critical damping) 
Case III. c? < 4mk. Complex conjugate roots. (Underdamping) 


They correspond to the three Cases I, I, Il in Sec. 2.2. 


Discussion of the Three Cases 
Case |. Overdamping 


If the damping constant c is so large that c? > 4mk, then 1 and Ag are distinct real roots. 
In this case the corresponding general solution of (5) is 


(7) y(t) = ee 7 PE ee OR" 


We see that in this case, damping takes out energy so quickly that the body does not 
oscillate. For t > 0 both exponents in (7) are negative because a > 0, B > 0, and 
B =? - k/m < a”. Hence both terms in (7) approach zero as t—>~%™. Practically 
speaking, after a sufficiently long time the mass will be at rest at the static equilibrium 
position (y = 0). Figure 37 shows (7) for some typical initial conditions. 
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(a) (b) 


© Positive 
@ Zero Initial velocity 
© Negative 
Fig. 37. Typical motions (7) in the overdamped case 
(a) Positive initial displacement 
(b) Negative initial displacement 


Case Il. Critical Damping 


Critical damping is the border case between nonoscillatory motions (Case J) and oscillations 
(Case II). It occurs if the characteristic equation has a double root, that is, if C= A4mk, 
so that 8 = 0, Ay = Ag = —a. Then the corresponding general solution of (5) is 


(8) y(t) = (cy + cof)e™™. 


at 


This solution can pass through the equilibrium position y = O at most once because e~ 
is never zero and cy + Cyt can have at most one positive zero. If both c, and cg are positive 
(or both negative), it has no positive zero, so that y does not pass through 0 at all. Figure 38 
shows typical forms of (8). Note that they look almost like those in the previous figure. 


©@ Positive | 
@ Zero Initial velocity 
@ Negative J 


Fig. 38. Critical damping [see (8)] 
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EXAMPLE 2 


Case III. Underdamping 


This is the most interesting case. It occurs if the damping constant c is so small that 
c? < 4mk. Then BG in (6) is no longer real but pure imaginary, say, 


1 bo 
(9) B = iw* where o* = oma Vamk — c2 = rs ra (>0). 


(We now write w* to reserve w for driving and electromotive forces in Secs. 2.8 and 2.9.) 
The roots of the characteristic equation are now complex conjugates, 


Ay = -a st iw*, Ag = —a — iw* 
with a = c/(2m), as given in (6). Hence the corresponding general solution is 


(10) y(t) = e “(A cos w*t + B sin w*t) = Ce~™ cos (w*t — 8) 


where C2 = A? + Band tan 6 = B/A, as in (4*). 

This represents damped oscillations. Their curve lies between the dashed curves 
y = Ce“ and y = —Ce~ in Fig. 39, touching them when w*t — 6 is an integer multiple 
of 77 because these are the points at which cos (w*t — 6) equals 1 or —1. 

The frequency is w*/(27r) Hz (hertz, cycles/sec). From (9) we see that the smaller 
c (> 0) is, the larger is w* and the more rapid the oscillations become. If c approaches 0, 
then w* approaches w) = Vk/m, giving the harmonic oscillation (4), whose frequency 
@o/(277) is the natural frequency of the system. 


Fig. 39. Damped oscillation in Case III [see (10)] 


The Three Cases of Damped Motion 


How does the motion in Example 1 change if we change the damping constant c from one to another of the 
following three values, with y(0) = 0.16 and y’(0) = 0 as before? 


(I) c = 100 kg/sec, (Il) c = 60 kg/sec, (II) c = 10 kg/sec. 


Solution. It is interesting to see how the behavior of the system changes due to the effect of the damping, 
which takes energy from the system, so that the oscillations decrease in amplitude (Case III) or even disappear 
(Cases II and I). 

(D With m = 10 and k = 90, as in Example 1, the model is the initial value problem 


10y” + 100y’ + 90y = 0, — y(0) = 0.16 [meter], y'(0) = 0. 
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The characteristic equation is 10A7 + 100A + 90 = 10(A 4 9\A + 1) = 0. It has the roots —9 and —1. This 
gives the general solution 


t 


y= cye + coe". a 


t 


We also need y’ = —9cye7** — ce. 


The initial conditions give cy + cg = 0.16, —9c, — cg = 0. The solution is cy = —0.02, co = 0.18. Hence in 
the overdamped case the solution is 


y = —0.02e~-* + 0.18277. 


It approaches 0 as t—> ~. The approach is rapid; after a few seconds the solution is practically 0, that is, the 
iron ball is at rest. 

(I) The model is as before, with c = 60 instead of 100. The characteristic equation now has the form 
10A2 + 60A + 90 = 10(A + 3)? = 0. It has the double root —3. Hence the corresponding general solution is 


t 


y=(cqy + cote? ‘ ¢ 


We also need y’ = (cg — 3c, — 3cof)e > : 
The initial conditions give y(0) = cy = 0.16, y’(0) = cy — 3c, = 0, cg = 0.48. Hence in the critical case the 
solution is 


y = (0.16 + 0.48pe7%, 


It is always positive and decreases to 0 in a monotone fashion. 

(IIT) The model now is 10y” + 10y’ + 90y = 0. Since c = 10 is smaller than the critical c, we shall get 
oscillations. The characteristic equation is 10A2 + 10A + 90 10[(A 4 3)? + 9 4] 0. It has the complex 
roots [see (4) in Sec. 2.2 with a = 1| and b = 9] 


A=-0.5 + V0.57-9=-0.5 + 2.963. 
This gives the general solution 
y = e 9 (A cos 2.96t + B sin 2.961). 
Thus y(0) = A = 0.16. We also need the derivative 


y’ = e °5*(—0.5A cos 2.96t — 0.5B sin 2.96 — 2.96A sin 2.96t + 2.96B cos 2.961). 


Hence y’(0) 0.5A + 2.968 = 0, B = 0.5A/2.96 = 0.027. This gives the solution 


y = e 9"(0.16 cos 2.96t + 0.027 sin 2.961) = 0.162e 9" cos (2.96t — 0.17). 


We see that these damped oscillations have a smaller frequency than the harmonic oscillations in Example | by 
about 1% (since 2.96 is smaller than 3.00 by about 1%). Their amplitude goes to zero. See Fig. 40. B 


» 
0.15 


Fig. 40. The three solutions in Example 2 


This section concerned free motions of mass—spring systems. Their models are homo- 
geneous linear ODEs. Nonhomogeneous linear ODEs will arise as models of forced 
motions, that is, motions under the influence of a “driving force.” We shall study them 
in Sec. 2.8, after we have learned how to solve those ODEs. 
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PROBLEM SET 2-4 


1-10 


HARMONIC OSCILLATIONS The cylindrical buoy of diameter 60 cm in Fig. 43 is 


(UNDAMPED MOTION) 


. Initial value problem. Find the harmonic motion (4) 
that starts from yo with initial velocity vo. Graph or 
sketch the solutions for w) = 77, yo = 1, and various 
Ug of your choice on common axes. At what t-values 
do all these curves intersect? Why? 

. Frequency. Ifa weight of 20 nt (about 4.5 Ib) stretches 
a certain spring by 2 cm, what will the frequency of the 
corresponding harmonic oscillation be? The period? 

. Frequency. How does the frequency of the harmonic 
oscillation change if we (i) double the mass, (ii) take 
a spring of twice the modulus? First find qualitative 


floating in water with its axis vertical. When depressed 
downward in the water and released, it vibrates with 
period 2 sec. What is its weight? 


Fig. 43. Buoy (Problem 8) 


: 9. Vibration of water in a tube. If 1 liter of water (about 
answers by physics, then look at formulas. 1.06 US quart) is vibrating up and down under the 
. Initial velocity. Could you make a harmonic oscillation influence of gravitation in a U-shaped tube of diameter 
move faster by giving the body a greater initial push? 2 cm (Fig. 44), what is the frequency? Neglect friction. 
. Springs in parallel. What are the frequencies of First guess. 
vibration of a body of mass m = 5 kg (i) on a spring 
of modulus k, = 20 nt/m, (ii) on a spring of modulus 
ko = 45 nt/m, (iii) on the two springs in parallel? See 
Fig. 41. 
Fig. 44. Tube (Problem 9) 
10. TEAM PROJECT. Harmonic Motions of Similar 


Fig. 41. Parallel springs (Problem 5) 

. Spring in series. If a body hangs on a spring s; of 
modulus k, = 8, which in turn hangs on a spring so 
of modulus ky = 12, what is the modulus k of this 
combination of springs? 

. Pendulum. Find the frequency of oscillation of a 
pendulum of length L (Fig. 42), neglecting air 
resistance and the weight of the rod, and assuming 0 
to be so small that sin 0 practically equals 6. 


Body of 
mass m 


Fig. 42. Pendulum (Problem 7) 


8. Archimedian principle. This principle states that the 


buoyancy force equals the weight of the water 
displaced by the body (partly or totally submerged). 


Models. The unifying power of mathematical meth- 
ods results to a large extent from the fact that different 
physical (or other) systems may have the same or very 
similar models. Illustrate this for the following three 
systems 

(a) Pendulum clock. A clock has a 1-meter pendulum. 
The clock ticks once for each time the pendulum 
completes a full swing, returning to its original position. 
How many times a minute does the clock tick? 


(b) Flat spring (Fig. 45). The harmonic oscillations 
of a flat spring with a body attached at one end and 
horizontally clamped at the other are also governed by 
(3). Find its motions, assuming that the body weighs 
8 nt (about 1.8 Ib), the system has its static equilibrium 
1 cm below the horizontal line, and we let it start from 
this position with initial velocity 10 cm/sec. 


ee | 


» 


Fig. 45. Flat spring 


70 


CHAP. 2. Second-Order Linear ODEs 


(c) Torsional vibrations (Fig. 46). Undamped 
torsional vibrations (rotations back and forth) of a 
wheel attached to an elastic thin rod or wire are 
governed by the equation 99” + K@ = 0, where 0 
is the angle measured from the state of equilibrium. 
Solve this equation for K/Ig = 13.69 sec”, initial 
angle 30°(= 0.5235 rad) and initial angular velocity 
20° sec! (= 0.349 rad - sec~}). 


Fig. 46. Torsional vibrations 


11-20 


DAMPED MOTION 


11. 


12. 


13. 


14. 


15. 


16. 


17. 


18. 


Overdamping. Show that for (7) to satisfy initial condi- 
tions y(O) = yo and v(0) = vg we must have cy = 
[1 + a/B)yo + Vo/Bl/2 and cz = [11 — a/B)yo — 
Vo/B)/2. 


Overdamping. Show that in the overdamped case, the 
body can pass through y = 0 at most once (Fig. 37). 


Initial value problem. Find the critical motion (8) 
that starts from yo with initial velocity vo. Graph 
solution curves for a = 1, yo = 1 and several vg such 
that (i) the curve does not intersect the f-axis, (ii) it 
intersects it at tf = 1, 2,...,5, respectively. 


Shock absorber. What is the smallest value of the 
damping constant of a shock absorber in the suspen- 
sion of a wheel of a car (consisting of a spring and an 
absorber) that will provide (theoretically) an oscillation- 
free ride if the mass of the car is 2000 kg and the spring 
constant equals 4500 kg/ sec”? 


Frequency. Find an approximation formula for w* in 
terms of wo by applying the binomial theorem in (9) 
and retaining only the first two terms. How good is the 
approximation in Example 2, III? 


Maxima. Show that the maxima of an underdamped 
motion occur at equidistant t-values and find the 
distance. 


Underdamping. Determine the values of ¢ corre- 
sponding to the maxima and minima of the oscillation 
y(t) = e~' sin t. Check your result by graphing y(2). 


Logarithmic decrement. Show that the ratio of 
two consecutive maximum amplitudes of a damped 
oscillation (10) is constant, and the natural logarithm 
of this ratio called the logarithmic decrement, 


19. 


20. 


equals A = 27ra/w*. Find A for the solutions of 
y” + 2y' + 5y =0. 

Damping constant. Consider an underdamped motion 
of a body of mass m = 0.5 kg. If the time between two 
consecutive maxima is 3 sec and the maximum 
amplitude decreases to 3 its initial value after 10 cycles, 
what is the damping constant of the system? 


CAS PROJECT. Transition Between Cases I, II, 
III. Study this transition in terms of graphs of typical 
solutions. (Cf. Fig. 47.) 


(a) Avoiding unnecessary generality is part of good 
modeling. Show that the initial value problems (A) 
and (B), 

(A) y tey +y=0, yO=1, y'=0 
(B) the same with different c and y’(0) = —2 (instead 
of 0), will give practically as much information as a 
problem with other m, k, y(O), y’(0). 

(b) Consider (A). Choose suitable values of c, 
perhaps better ones than in Fig. 47, for the transition 
from Case III to II and I. Guess c for the curves in the 
figure. 

(c) Time to go to rest. Theoretically, this time is 
infinite (why?). Practically, the system is at rest when 
its motion has become very small, say, less than 0.1% 
of the initial displacement (this choice being up to us), 
that is in our case, 


(11) |y()| < 0.001 for all ¢ greater than some 14. 


In engineering constructions, damping can often be 
varied without too much trouble. Experimenting with 
your graphs, find empirically a relation between fy, 
and c. 

(d) Solve (A) analytically. Give a reason why the 
solution c of y(t2) = —0.001, with tg the solution of 
y’(t) = 0, will give you the best possible c satisfying 
(11). 

(e) Consider (B) empirically as in (a) and (b). What 
is the main difference between (B) and (A)? 
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2.5 Euler-Cauchy Equations 


EXAMPLE-1 


Euler—Cauchy equations* are ODEs of the form 
(1) aye + axy’ + by =0 


with given constants a and b and unknown function y(x). We substitute 


m ’ m-1 ”" m—-2 


y=x', y =mx ; y =m(m — 1)x 
into (1). This gives 
x2m(m _ Ix”? + axmx™~1 + bx™ = 0 


and we now see that y = x”” was a rather natural choice because we have obtained a com- 
mon factor x”. Dropping it, we have the auxiliary equation m(m — 1) + am + b = O or 


(2) m2 + (a- 1)m+b=0. (Note: a — 1, not a.) 


Hence y = x” is a solution of (1) if and only if m is a root of (2). The roots of (2) are 


3) my =5(1-a)+ V40_—a?-—b, = mg=s1-—a)—- Vad —-a?- 2b. 


Case I. Real different roots m , and mg give two real solutions 
yy(x) = x™ and yo(x) = x". 


These are linearly independent since their quotient is not constant. Hence they constitute 
a basis of solutions of (1) for all x for which they are real. The corresponding general 
solution for all these x is 


(4) Bei 63 ier aa > le (c,, Cy arbitrary). 


General Solution in the Case of Different Real Roots 


The Euler—Cauchy equation xe" + 1.5xy’ — 0.5y = 0 has the auxiliary equation m= + 0.5m — 0.5 = 0. The 
roots are 0.5 and —1. Hence a basis of solutions for all positive x is yy = x9 and yg = 1/x and gives the general 
solution 


c 
y=oVx + = (x > 0). Ba 


4LEONHARD EULER (1707-1783) was an enormously creative Swiss mathematician. He made 
fundamental contributions to almost all branches of mathematics and its application to physics. His important 
books on algebra and calculus contain numerous basic results of his own research. The great French 
mathematician AUGUSTIN LOUIS CAUCHY (1789-1857) is the father of modern analysis. He is the creator 
of complex analysis and had great influence on ODEs, PDEs, infinite series, elasticity theory, and optics. 
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EXAMPLE 2 


EXAMPLE 3 


Yi 


CHAP. 2. Second-Order Linear ODEs 


Case II. A real double root m, = 4(1 — a) occurs if and only if b = a(a — 1) because 


then (2) becomes [m + 3(a = br, as can be readily verified. Then a solution is 
= x9-®/2. and (1) is of the form 


1 - ay 
(5) x2y” + axy’ + 41 —a)*y = 0 or y" + ey! +! ri 4) ; 
x x 


A second linearly independent solution can be obtained by the method of reduction of 
order from Sec. 2.1, as follows. Starting from yg = uy,, we obtain for u the expression 
(9) Sec. 2.1, namely, 


u= [ud where u = Se(-[rac). 
yi 


From (5) in standard form (second ODE) we see that p = a/x (not ax; this is essential!). 
Hence exp {(—p dx) = exp (—a In x) = exp (Inx~%) = 1/x. Division by i =a 
gives U = 1/x, so that u = In x by integration. Thus, ye = uy, = y1 In x, and y, and yo 
are linearly independent since their quotient is not constant. The general solution 
corresponding to this basis is 


(6) y = (cy + cg in x) x™, m= a(1 — a). 


General Solution in the Case of a Double Root 


vv 


The Euler-Cauchy equation xy" — 5xy’ + 9y = 0 has the auxiliary equation m” — 6m + 9 = 0. It has the 
double root m = 3, so that a general solution for all positive x is 


y = (cy + cg In x) x, fai] 


Case III. Complex conjugate roots are of minor practical importance, and we discuss 
the derivation of real solutions from complex ones just in terms of a typical example. 


Real General Solution in the Case of Complex Roots 


The Euler—Cauchy equation x?y” + 0.6xy’ + 16.04y = 0 has the auxiliary equation m@ — 0.4m + 16.04 = 0. 
The roots are complex conjugate, m , = 0.2 + 4i and mg = 0.2 — 4i, where i = V—1. We now use the trick 


of writing x = e!™* and obtain 
xm = 02+ 4 = x0 2(eln aya = 0-204 In a 
xm = 02-4 = x0 2eln Bat = 020-4 In xyi 


Next we apply Euler’s formula (11) in Sec. 2.2 with t = 4 In x to these two formulas. This gives 


x™ = x°2Fcos (4 In x) + isin (4 In x)], 

x™ = x92Fcos (4 In x) — isin (4 In x)]. 
We now add these two formulas, so that the sine drops out, and divide the result by 2. Then we subtract the 
second formula from the first, so that the cosine drops out, and divide the result by 27. This yields 


x°? cos (4 In x) and x°? sin (4 In x) 


respectively. By the superposition principle in Sec. 2.2 these are solutions of the Euler-Cauchy equation (1). 
Since their quotient cot (4 In x) is not constant, they are linearly independent. Hence they form a basis of solutions, 
and the corresponding real general solution for all positive x is 


(8) y= x°IA cos (4Inx) + B sin (4 Inx)]. 
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EXAMPLE 4 


Figure 48 shows typical solution curves in the three cases discussed, in particular the real basis functions in 
Examples | and 3. | 


y xInx * 
rae x0 Inx 1 i x02 sin (4 In x) 
: 279-5 In x 
0.5- 
aK fi [«t nx 
x x 
ee 114 2 
-1.0- 
1.5 -1.5b x92 cos (4 In x) 


Case I: Real roots Case II: Double root Case III: Complex roots 


Fig. 48. Euler—Cauchy equations 


Boundary Value Problem. Electric Potential Field Between Two Concentric Spheres 


Find the electrostatic potential v = u(r) between two concentric spheres of radii ry = 5 cm and rg = 10 cm 
kept at potentials v; = 110 V and veg = 0, respectively. 
Physical Information. v(r) is a solution of the Euler—Cauchy equation rv” + 2v’ = 0, where v’ = du/dr. 


Solution. The auxiliary equation is m” + m = 0. It has the roots 0 and —1. This gives the general solution 


u(r) = cy + C2/r. From the “boundary conditions” (the potentials on the spheres) we obtain 


c2 c2 
aS) =e to = 0, v0) = «1 + 55 = 0 


By subtraction, c2/10 = 110,cp = 1100. From the second equation, cy = —co/10 = —110. Answer: 


u(r) = —110 + 1100/r V. Figure 49 shows that the potential is not a straight line, as it would be for a potential 
between two parallel plates. For example, on the sphere of radius 7.5 cm it is not 110/2 = 55 V, but considerably 
less. (What is it?) a 


| 
5 6 7 8 9 10 r 


Fig. 49. Potential v(r) in Example 4 


PROBLEM SET 2°75 


1. Double root. Verify directly by substitution that 
x4-®/? In x is a solution of (1) if (2) has a double root, 
but x” In x and x”? In x are not solutions of (1) if the 
roots m, and mg of (2) are different. 


2-11| GENERAL SOLUTION 


4. xy” + 2y' =0 

5. 4x?y" + 5y =0 

6. x2y" + 0.7xy’ — 0.ly = 0 
7. (x2D? — 4xD + 6Dy = C 
8. (x2D? — 3xD + 4Dy = 0 


Find a real general solution. Show the details of your work. 9. (x2D? — 0.2xD + 0.36Dy = 0 


2. x2y” — 20y =0 


10. (x2D? — xD + 5Dy =0 


3. 5x7y" + 23xy’ + 16.2y = 0 11. (2D? — 3xD + 10Dy = 0 
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12-19| INITIAL VALUE PROBLEM 20. TEAM PROJECT. Double Root 
Solve and graph the solution. Show the details of your work. (a) Derive a second linearly independent solution of 


(1) by reduction of order; but instead of using (9), Sec. 


20 £ _ oma ! — 
12. x"y — 4xy + by=0, y= 04, y(1)=0 2.1, perform all steps directly for the present ODE (1). 


13. x2y"” + 3xy’ + 0.75y = 0, y(1) = 1, (b) Obtain x”’Inx by considering the solutions «”" 
yi) =-15 and x” ** of a suitable Euler-Cauchy equation and 
14, x?y" +2y' + 9y=0, yA) =0, yA) = 25 ene 
c) Verify by substitution that x”” In x,m = (1 — a)/2, 
15. x2y" " axy! eg: wag y'Q) ak (c) Verify by substitution that.x°" In x, m = ( a)/ 


is a solution in the critical case. 

16. (x7D? —3xD + 4Iy=0, yd) =—a, y'(l) = 20 (d) Transform the Euler-Cauchy equation (1) into 

17. (2D? +xD+Dy=0, yl)=1, y()=1 al ce with constant coefficients by setting 

x=e(x> 0). 

(e) Obtain a second linearly independent solution of 

19. (2D? — xD — 15I)y =0, yl) = 0.1, the Euler—Cauchy equation in the “critical case” from 
y'(1) = -4.5 that of a constant-coefficient ODE. 


18. (9x7D? + 3xD + Dy =0, y)=1, yA) =0 


2.6 Existence and Uniqueness 
of Solutions. Wronskian 


In this section we shall discuss the general theory of homogeneous linear ODEs 
(1) y" + pay’ + q@y = 0 


with continuous, but otherwise arbitrary, variable coefficients p and q. This will concern 
the existence and form of a general solution of (1) as well as the uniqueness of the solution 
of initial value problems consisting of such an ODE and two initial conditions 


(2) y(xo) = Ko, y' (xo) = Ky 


with given xo, Ko, and Ky. 
The two main results will be Theorem 1, stating that such an initial value problem 
always has a solution which is unique, and Theorem 4, stating that a general solution 


(3) y = cyy1 + coye (cy, Cz arbitrary) 


includes all solutions. Hence linear ODEs with continuous coefficients have no “singular 
solutions” (solutions not obtainable from a general solution). 

Clearly, no such theory was needed for constant-coefficient or Euler-Cauchy equations 
because everything resulted explicitly from our calculations. 

Central to our present discussion is the following theorem. 


THEOREM 1 Existence and Uniqueness Theorem for Initial Value Problems 


If p(x) and q(x) are continuous functions on some open interval I (see Sec. 1.1) and 
Xo is in I, then the initial value problem consisting of (1) and (2) has a unique 
solution y(x) on the interval I. 
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THEOREM 2 


PROOF 


The proof of existence uses the same prerequisites as the existence proof in Sec. 1.7 
and will not be presented here; it can be found in Ref. [A11] listed in App. 1. Uniqueness 
proofs are usually simpler than existence proofs. But for Theorem 1, even the uniqueness 
proof is long, and we give it as an additional proof in App. 4. 


Linear Independence of Solutions 


Remember from Sec. 2.1 that a general solution on an open interval J is made up from a 
basis yj, ye on /, that is, from a pair of linearly independent solutions on J. Here we call 
y1, y2 linearly independent on / if the equation 


(4) kyyi(x) + koyo(x) = 0 onl implies ky =0, ko =0. 
We call yj, yo linearly dependent on / if this equation also holds for constants ky, kg 


not both 0. In this case, and only in this case, y; and yg are proportional on J, that is (see 
Sec. 2.1), 


(5) (a) yy = kyo or (b) yo = ly for all on J. 


For our discussion the following criterion of linear independence and dependence of 
solutions will be helpful. 


Linear Dependence and Independence of Solutions 


Let the ODE (1) have continuous coefficients p(x) and q(x) on an open interval I. 
Then two solutions y and yz of (1) on I are linearly dependent on I if and only if 
their “Wronskian”’ 


(6) W01, Ya) = yiy2 — Yayi 


is 0 at some Xo in I. Furthermore, if W = 0 at an x = x9 in I, then W=0 on T; 
hence, if there is an x, in I at which W is not 0, then yy, yo are linearly independent 
on I. 


(a) Let y, and yo be linearly dependent on J. Then (5a) or (5b) holds on J. If (5a) holds, 
then 


W(y1, ya) = yiy2 — yay = kyaye — yekye = 0. 


Similarly if (Sb) holds. 

(b) Conversely, we let W(y1, yo) = 0 for some x = xg and show that this implies linear 
dependence of y; and ys on J. We consider the linear system of equations in the unknowns 
k 1> ko 


kyy1(%0) + Keye(xo) = 0 


(7) 
kyy1(X0) + keyo(xo) = 0. 
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To eliminate ky, multiply the first equation by yg and the second by —yg and add the 
resulting equations. This gives 


kiyi(xo)ya(xo) — kiyito)ya(xo) = k1W(y1(%0), ya(xo)) = 0. 


Similarly, to eliminate k,, multiply the first equation by —y1 and the second by y, and 
add the resulting equations. This gives 


kyW(y1(%o), Y2(Xo)) = 0. 


If W were not 0 at x9, we could divide by W and conclude that ky = kg = 0. Since W is 
0, division is not possible, and the system has a solution for which ky and kg are not both 
0. Using these numbers ky, ka, we introduce the function 


y(x) = kyy1@) + keyo(x). 


Since (1) is homogeneous linear, Fundamental Theorem | in Sec. 2.1 (the superposition 
principle) implies that this function is a solution of (1) on J. From (7) we see that it satisfies 
the initial conditions y(x9) = 0, y' (x9) = 0. Now another solution of (1) satisfying the 
same initial conditions is y* = 0. Since the coefficients p and q of (1) are continuous, 
Theorem | applies and gives uniqueness, that is, y = y*, written out 


kyy1 ae Koyo =0 on I. 


Now since k, and kg are not both zero, this means linear dependence of y1, ye on J. 

(c) We prove the last statement of the theorem. If W(xo) = 0 at an xo in J, we have 
linear dependence of yj, yg on J by part (b), hence W = 0 by part (a) of this proof. Hence 
in the case of linear dependence it cannot happen that W(x,) # 0 at an x, in J. If it does 
happen, it thus implies linear independence as claimed. o 


For calculations, the following formulas are often simpler than (6). 


(6%) W092) = (@) (2) y2 (140) orb) (2) 93 (v2 ¥ 0). 


These formulas follow from the quotient rule of differentiation. 


Remark. Determinants. Students familiar with second-order determinants may have 
noticed that 


yl 2 
Uy , 
1 y2 


Wy1, y2) = = y1yo — yoy. 


This determinant is called the Wronski determinant? or, briefly, the Wronskian, of two 
solutions y; and yg of (1), as has already been mentioned in (6). Note that its four entries 
occupy the same positions as in the linear system (7). 


Introduced by WRONSKI (JOSEF MARIA HONE, 1776-1853), Polish mathematician. 
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EXAMPLE 1 


EXAMPLE 2 


THEOREM 3 


PROOF 


Illustration of Theorem 2 
The functions yy = cos wx and yg = sin wx are solutions of y” + wy = 0. Their Wronskian is 
COS Wx sin wx 


: ! f : 
W(cos wx, sin wx) = . = yyy2 — yoy, = w cos” wx + w sin? wx = ow. 
—@ SiN wx Ww COS Wx 


Theorem 2 shows that these solutions are linearly independent if and only if w # 0. Of course, we can see 
this directly from the quotient yo/y; = tan wx. For w = 0 we have yg = 0, which implies linear dependence 


(why?). 3] 


Illustration of Theorem 2 for a Double Root 


A general solution of y’ — 2y’ + y = 0 on any interval is y = (cy + cox)e”. (Verify!). The corresponding 
Wronskian is not 0, which shows linear independence of e* and xe” on any interval. Namely, 


W(x, xe”) = (x + 1)e2” — xe?* = 0?” # 0, a 


e” (x + le” 


A General Solution of (1) Includes All Solutions 


This will be our second main result, as announced at the beginning. Let us start with 
existence. 


Existence of a General Solution 


If p(x) and q(x) are continuous on an open interval I, then (1) has a general solution 
on I. 


By Theorem 1, the ODE (1) has a solution y;(x) on J satisfying the initial conditions 


yilto) = 1, yi(xo) = 0 
and a solution yoa(x) on J satisfying the initial conditions 
ya(xo) = 0, — ya(xo) = I. 
The Wronskian of these two solutions has at x = xo the value 
W(y1(0), yo) = yivo)y2(x0) — ya(xoy1(%o) = 1. 
Hence, by Theorem 2, these solutions are linearly independent on J. They form a basis of 


solutions of (1) on J, and y = czy, + cgy2 with arbitrary ci, cg is a general solution of (1) 
on J, whose existence we wanted to prove. |_| 
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THEOREM 4 


PROOF 


CHAP. 2. Second-Order Linear ODEs 


We finally show that a general solution is as general as it can possibly be. 


A General Solution Includes All Solutions 


If the ODE (1) has continuous coefficients p(x) and q(x) on some open interval TI, 
then every solution y = Y(x) of (1) on I is of the form 


(8) Y(x) = Ciyi(x) + Coya(x) 
where yi, yg is any basis of solutions of (1) on I and Cy, C2 are suitable constants. 


Hence (1) does not have singular solutions (that is, solutions not obtainable from 
a general solution). 


Let y = Y(x) be any solution of (1) on J. Now, by Theorem 3 the ODE (1) has a general 
solution 


(9) y(x) = cyyy(X) + coya(x) 


on I. We have to find suitable values of c,, cp such that y(x) = Y(x) on I. We choose any 
Xg in J and show first that we can find values of cy, cp such that we reach agreement at 
Xo, that is, y(x9) = Y(%o) and y' (xo) ay (xg). Written out in terms of (9), this becomes 


(a) cyyi%o) + Coya(xo) = Y(Xo) 


(b) ciyi(xo) + caya(xo) = Y' (xo). 


(10) 


We determine the unknowns cy and cg. To eliminate co, we multiply (10a) by ya(x o) and 
(10b) by —ye(xo) and add the resulting equations. This gives an equation for cy. Then we 
multiply (10a) by —y1(xo) and (10b) by y1(xo) and add the resulting equations. This gives 
an equation for cp. These new equations are as follows, where we take the values of 
V1 Y1s 2, Yas Y, Y" at xo. 


ex(yiy2 — yayi) = c1W(y1, y2) = Yya — ya" 
Co(Viy2 — Yayi) = coW(y1, Ye) = yiY! — Yyt. 
Since y1, yg is a basis, the Wronskian W in these equations is not 0, and we can solve for 


c, and cy. We call the (unique) solution cy = Cy, co = Co. By substituting it into (9) we 
obtain from (9) the particular solution 


y*(x) = Cyy1@) + Coyo(). 
Now since Cj, Co is a solution of (10), we see from (10) that 
y*(xo) = Yxo), —-y*"(xo) = Y' (xo). 


From the uniqueness stated in Theorem | this implies that y* and Y must be equal 
everywhere on J, and the proof is complete. |_| 


SEC. 2.7. Nonhomogeneous ODEs 


1. 
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Reflecting on this section, we note that homogeneous linear ODEs with continuous variable 
coefficients have a conceptually and structurally rather transparent existence and uniqueness 
theory of solutions. Important in itself, this theory will also provide the foundation for our 
study of nonhomogeneous linear ODEs, whose theory and engineering applications form 
the content of the remaining four sections of this chapter. 


PROBLEEM—SET 2-6 


Derive (6*) from (6). 


2-8| BASIS OF SOLUTIONS. WRONSKIAN 


Find the Wronskian. Show linear independence by using 
quotients and confirm it by Theorem 2. 


2. e 
. oe. eo 2:8x 


4.0% eo h5% 


> 


~ x, 1/x 


3 2 


Fae een 6 


3 
4 
5 
6. 
7 
8 


e” cos wx, e” sin wx 


« cosh ax, sinh ax 


k 


. x” cos (In x), x sin (in x) 


9-15| ODE FOR GIVEN BASIS. WRONSKIAN. IVP 


(a) Find a second-order homogeneous linear ODE for 
which the given functions are solutions. (b) Show linear 
independence by the Wronskian. (c) Solve the initial value 
problem. 


9, 
10. 
11. 


12. 
13. 
14. 


15. 


cos 5x, sin 5x, (0) =3, y'(0) = —-5 
x™,x™, (1) = 2, y'(1) = 2m, — 4a 


e 7 cos 0.3x, e772” sin 0.3x, (0) = 3, 
y'(0) = -7.5 
x,x7Inx, yI=4 y)=6 


le, yO)=1, y'(0)=-1 


e cos ax, e * sin 77x, y(0) = 1, 
y’(0) =—-k-T7 
cosh 1.8x, sinh 1.8x, (0) = 14.20, y'(0) = 16.38 


16. TEAM PROJECT. Consequences of the Present 


Theory. This concerns some noteworthy general 
properties of solutions. Assume that the coefficients p 
and q of the ODE (1) are continuous on some open 
interval J, to which the subsequent statements refer. 
(a) Solve y” — y = 0 (a) by exponential functions, 
(b) by hyperbolic functions. How are the constants in 
the corresponding general solutions related? 


(b) Prove that the solutions of a basis cannot be 0 at 
the same point. 

(c) Prove that the solutions of a basis cannot have a 
maximum or minimum at the same point. 

(d) Why is it likely that formulas of the form (6*) 
should exist? 

(e) Sketch yy(x) =x? if x20 and O if x <0, 
yo(x) = 0 if x = 0 and x? if x <0. Show linear 
independence on —1<x< 1. What is_ their 
Wronskian? What Euler-Cauchy equation do yj, ye 
satisfy? Is there a contradiction to Theorem 2? 


(f) Prove Abel’s formula® 


x 


Woy1(x), ya(x)) = € exp | - | ple) a 


Xo 


where c = W(y1(Xo), yo(Xo)). Apply it to Prob. 6. Hint: 
Write (1) for y; and for yo. Eliminate q algebraically 
from these two ODEs, obtaining a first-order linear 
ODE. Solve it. 


2./ Nonhomogeneous ODEs 


We now advance from homogeneous to nonhomogeneous linear ODEs. 
Consider the second-order nonhomogeneous linear ODE 


(1) y" + p@y’ + q@y = rx) 


where r(x) # 0. We shall see that a “general solution” of (1) is the sum of a general 
solution of the corresponding homogeneous ODE 


5NIELS HENRIK ABEL (1802-1829), Norwegian mathematician. 
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(2) y" + pay’ + gay = 0 


and a “particular solution” of (1). These two new terms “general solution of (1)” and 
“particular solution of (1)” are defined as follows. 


DEFINITION General Solution, Particular Solution 


A general solution of the nonhomogeneous ODE (1) on an open interval / is a 
solution of the form 


(3) W(x) = yr(x) + yp(x); 


here, y, = c1y1 + Ceye is a general solution of the homogeneous ODE (2) on J and 
Yp is any solution of (1) on J containing no arbitrary constants. 

A particular solution of (1) on / is a solution obtained from (3) by assigning 
specific values to the arbitrary constants cy and cz in yp. 


Our task is now twofold, first to justify these definitions and then to develop a method 
for finding a solution y, of (1). 

Accordingly, we first show that a general solution as just defined satisfies (1) and that 
the solutions of (1) and (2) are related in a very simple way. 


THEOREM 1 Relations of Solutions of (1) to Those of (2) 


(a) The sum of a solution y of (1) on some open interval I and a solution ¥ of 
(2) on Tis a solution of (1) on I. In particular, (3) is a solution of (1) on I. 


(b) The difference of two solutions of (1) on I is a solution of (2) on I. 


PROOF (a) Let L[y] denote the left side of (1). Then for any solutions y of (1) and ¥ of (2) on J, 


Ly + 39] = Ly] + Ly) H=rt+0=,r. 


(b) For any solutions y and y* of (1) on J we have Liy — y*] = L[y] — Ll y*] = 
r-r=0. |_| 


Now for homogeneous ODEs (2) we know that general solutions include all solutions. 
We show that the same is true for nonhomogeneous ODEs (1). 


THEOREM 2 A General Solution of a Nonhomogeneous ODE Includes All Solutions 


If the coefficients p(x), q(x), and the function r(x) in (1) are continuous on some 
open interval [, then every solution of (1) on I is obtained by assigning suitable 
values to the arbitrary constants cy and C2, in a general solution (3) of (1) on I. 


PROOF Let y* be any solution of (1) on J and x9 any x in J. Let (3) be any general solution of 
(1) on /. This solution exists. Indeed, y;, = cyy1 + Coy exists by Theorem 3 in Sec. 2.6 
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because of the continuity assumption, and y, exists according to a construction to be 
shown in Sec. 2.10. Now, by Theorem 1(b) just proved, the difference Y = y* — y, isa 
solution of (2) on J. At xg we have 


¥(xo) = y*(x0) — Yp(to0). —*Y' (x0) = y*" (xo) — yp(2o). 


Theorem | in Sec. 2.6 implies that for these conditions, as for any other initial conditions 
in J, there exists a unique particular solution of (2) obtained by assigning suitable values 
to cy, C2 in yp. From this and y* = Y + yp the statement follows. a 


Method of Undetermined Coefficients 


Our discussion suggests the following. To solve the nonhomogeneous ODE (1) or an initial 
value problem for (1), we have to solve the homogeneous ODE (2) and find any solution 
Yp Of (1), so that we obtain a general solution (3) of (1). 

How can we find a solution y, of (1)? One method is the so-called method of 
undetermined coefficients. It is much simpler than another, more general, method (given 
in Sec. 2.10). Since it applies to models of vibrational systems and electric circuits to be 
shown in the next two sections, it is frequently used in engineering. 

More precisely, the method of undetermined coefficients is suitable for linear ODEs 
with constant coefficients a and b 


(4) a + ay’ + by = r(x) 


when r(x) is an exponential function, a power of x, a cosine or sine, or sums or products 
of such functions. These functions have derivatives similar to r(x) itself. This gives the 
idea. We choose a form for y, similar to r(x), but with unknown coefficients to be 
determined by substituting that y, and its derivatives into the ODE. Table 2.1 on p. 82 
shows the choice of yp for practically important forms of r(x). Corresponding rules are 
as follows. 


Choice Rules for the Method of Undetermined Coefficients 


(a) Basic Rule. Jf r(x) in (4) is one of the functions in the first column in 
Table 2.1, choose yy in the same line and determine its undetermined 
coefficients by substituting y, and its derivatives into (4). 


(b) Modification Rule. /f a term in your choice for y, happens to be a 
solution of the homogeneous ODE corresponding to (4), multiply this term 
by x (or by x if this solution corresponds to a double root of the 
characteristic equation of the homogeneous ODE). 


(c) Sum Rule. /f r(x) is a sum of functions in the first column of Table 2.1, 
choose for yp the sum of the functions in the corresponding lines of the 
second column. 


The Basic Rule applies when r(x) is a single term. The Modification Rule helps in the 
indicated case, and to recognize such a case, we have to solve the homogeneous ODE 
first. The Sum Rule follows by noting that the sum of two solutions of (1) with r = ry 
and r = rs (and the same left side!) is a solution of (1) with r = ry + ro. (Verify!) 
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The method is self-correcting. A false choice for y, or one with too few terms will lead 
to acontradiction. A choice with too many terms will give a correct result, with superfluous 
coefficients coming out zero. 

Let us illustrate Rules (a)—(c) by the typical Examples 1-3. 


Table 2.1 Method of Undetermined Coefficients 


Term in r(x) Choice for y,(x) 
ke” Ce” 
kx" (n =0,1,°°°) |) Kynx"™ + Kyiux™ 1 +++) + Kyx + Ko 
k cos wx 


: K cos wx + M sin wx 
k sin wx 


ke®” cos wx ; 
e°*(K cos wx + M sin wx) 
ke®” sin wx 


Application of the Basic Rule (a) 

Solve the initial value problem 

(5) y” +y=0.001x7, y0)=0, y(O)=15. 

Solution. Step 1. General solution of the homogeneous ODE. The ODEy” + y = Ohas the general solution 
Yn = Acosx + Bsinx. 


Step 2. Solution y, of the nonhomogeneous ODE. We first try yp = Kx", Then Yp = 2K. By substitution, 
2K + Kx? = 0.001x?. For this to hold for all x, the coefficient of each power of x (x? and x°) must be the same 
on both sides; thus K = 0.001 and 2K = 0, a contradiction. 

The second line in Table 2.1 suggests the choice 


Yp = Kox? + Kix + Ko. Then yy + yp = 2Ka + Kox® + Kyx + Ko = 0.001x”. 


Equating the coefficients of x2, Ry x° on both sides, we have Ky = 0.001, Ky = 0,2Ka + Ko = 0. Hence 
Ko = —2Kz = —0.002. This gives y, = 0.001x? — 0.002, and 


Y=JYn+t+ Yp =Acosx + Bsinx + 0.001x2 — 0.002. 


Step 3. Solution of the initial value problem. Setting x = 0 and using the first initial condition gives 
y(0) = A — 0.002 = 0, hence A = 0.002. By differentiation and from the second initial condition, 


! 


y y, J Vp A sinx + Bcos x + 0.002x and y'(0) =B=15. 


This gives the answer (Fig. 50) 


y = 0.002 cos x + 1.5 sinx + 0.001x2 — 0.002. 


Figure 50 shows y as well as the quadratic parabola y, about which y is oscillating, practically like a sine curve 
since the cosine term is smaller by a factor of about 1/1000. a 


Fig. 50. Solution in Example 1 
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EXAMPLE 2 


EXAMPLE 3 


Application of the Modification Rule (b) 


Solve the initial value problem 
(6) y" + 3y’ + 2.25y = -10e*, (0) = 1, -y'(0) = O. 


Solution. Step 1. General solution of the homogeneous ODE. The characteristic equation of the homogeneous 
ODE is A? + 3A + 2.25 = (A + 1.5)” = 0. Hence the homogeneous ODE has the general solution 


yp = (cy + coxje7h**, 


Step 2. Solution y, of the nonhomogeneous ODE. The function e 1 on the right would normally require 
the choice Ce~t*”. But we see from yp that this function is a solution of the homogeneous ODE, which 
corresponds to a double root of the characteristic equation. Hence, according to the Modification Rule we have 
to multiply our choice function by x". That is, we choose 


Yp = Cx2e71 57, Then Vp = C(2x — 1.5x7)e71 5%, Vp = C2 — 3x — 3x + 2.25x7)e— 152, 


We substitute these expressions into the given ODE and omit the factor e 1°”. This yields 


C(2 — 6x + 2.25x7) + 3C(2x — 1.5x?) + 2.25Cx? = -10. 


Comparing the coefficients of x2, Xi x? gives 0 = 0,0 = 0,2C = —10, hence C = —5. This gives the solution 
Yp = —5x7e7 1", Hence the given ODE has the general solution 
y =yn t+ yp = (cr + conde 19" — 5x7%e71 


Step 3. Solution of the initial value problem. Setting x = 0 in y and using the first initial condition, we obtain 
y(0) = cy = 1. Differentiation of y gives 


y’ = (co — 1.5cy — 1.5cyxye7h” — 10xe7 1” + 7.520715”, 


From this and the second initial condition we have y’(0) = cz — 1.5cy = 0. Hence cg = 1.5c, = 1.5. This gives 
the answer (Fig. 51) 


y = (1 + L5xe7 bt” = 5x27” = (1 + 15x — Sx7%)e7 1, 


The curve begins with a horizontal tangent, crosses the x-axis at x = 0.6217 (where 1 + 1.5x — 5x” = 0) and 
approaches the axis from below as x increases. 


x 
Fig. 51. Solution in Example 2 
Application of the Sum Rule (c) 
Solve the initial value problem 
(7) y" + 2y' + 0.75y = 2 cos x — 0.25 sin x + 0.09x, y(0) = 2.78, y'(0) = —0.43. 


Solution. Step 1. General solution of the homogeneous ODE. The characteristic equation of the homogeneous 
ODE is 


M2 +20 +0.75 =A +) (A438) =0 


which gives the general solution y;, = cye/? + cge7?"/. 
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Step 2. Particular solution of the nonhomogeneous ODE. We write yy = Yp1 + Yp2 and, following Table 2.1, 
(C) and (B), 


Yp1 = Kcosx + Msinx and Yp2 = Kyx + Ko. 


Differentiation gives Ypl = —Ksinx + Mcosx, Ypl = —Kcosx — Msinxand Ype = 1, Yp2 = 0. Substitution 
of yp1 into the ODE in (7) gives, by comparing the cosine and sine terms, 


K + 2M + 0.75K = 2, M — 2K + 0.75M 0.25, 
hence K = 0 and M = 1. Substituting ypz into the ODE in (7) and comparing the x- and x°-terms gives 
0.75K, = 0.09, 2K, + 0.75Ko = 0, thus K, = 0.12, Ko = —0.32. 
Hence a general solution of the ODE in (7) is 
y = ce + coe 3*/* + sin x + 0.12x — 0.32. 


Step 3. Solution of the initial value problem. From y, y’ and the initial conditions we obtain 


y(0) = cy + cg — 0.32 = 2.78, —-y'(0) = —dcy — Beg + 1 + 0.12 = —0.4. 
Hence c, = 3.1, co = 0. This gives the solution of the IVP (Fig. 52) 


y = 3.le 7? + sin x + 0.12x — 0.32. o 


a 2 
8 10 12 14 16 18 20 x 


! 
o 
a 

T 


Fig. 52. Solution in Example 3 


Stability. The following is important. If (and only if) all the roots of the characteristic 

equation of the homogeneous ODE y” + ay’ + by = 0 in(4) are negative, or have a negative 

real part, then a general solution y;, of this ODE goes to 0 as x — o, so that the “transient 

solution” y = y;, + yp of (4) approaches the “steady-state solution” y,. In this case the 

nonhomogeneous ODE and the physical or other system modeled by the ODE are called 

stable; otherwise they are called unstable. For instance, the ODE in Example | is unstable. 
Applications follow in the next two sections. 


PROBLEM SET 2-7 


1-10 | NONHOMOGENEOUS LINEAR ODEs: 2. 10y" + 50y’ + 57.6y = cos x 
GENERAL SOLUTION 3. y" + 3y! + 2y = 12x? 
Find a (real) general solution. State which rule you are 4. y” — 9y = 18 cos mx 
using. Show each step of your work. 5. y” + 4y' + 4y =e cosx 
1. y” + 5y’ + 4y = 1007 6. y" +y! + (mn? + Dy =e sin 7x 
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7. (D? + 2D + 3Dy = 3e" + $x 
8. (3D? 4 271)y = 3 cos x + cos 3x 
9. (D? — 16Dy = 9.6e* + 30e* 
10. (D? + 2D + Dy = 2x sinx 


NONHOMOGENEOUS LINEAR 
ODEs: IVPs 


11-18 


Solve the initial value problem. State which rule you are 


using. Show each step of your calculation in detail. 
11. y” + 3y = 18x7, (0) = -3, y’(0) =0 
12. y" + 4y = -12sin2x, y(0) = 1.8, y'(0) = 5.0 
13. 8y" — 6y’ + y = 6coshx, y(0) = 0.2, 
y'(0) = 0.05 
14. y” + 4y’ + 4y = e7* sin 2x, (0) = 1, 
y'(0) = -1.5 
15. (x2D? — 3xD + 31)y = 3 Inx — 4, 
yd) =0, yU)=1; yp =Inx 


16. (D? — 2D)y = 6c?" — 4e-?*,_ y(0) = —-1, y'(0) =6 


17. (D? + 0.2D + 0.26Dy = 1.2209", 
y'(0) = 0.35 


(0) = 3.5, 


18. 


19. 


20. 
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(D? + 2D + 10Dy = 17 sinx — 37 sin 3x, 
y(0) = 6.6, y'(0) = —2.2 


CAS PROJECT. Structure of Solutions of Initial 
Value Problems. Using the present method, find, 
graph, and discuss the solutions y of initial value 
problems of your own choice. Explore effects on 
solutions caused by changes of initial conditions. 
Graph y,,Y,¥ — Yp separately, to see the separate 
effects. Find a problem in which (a) the part of y 
resulting from y;,, decreases to zero, (b) increases, 
(c) is not present in the answer y. Study a problem with 
y(0) = 0, y'(0) = 0. Consider a problem in which 
you need the Modification Rule (a) for a simple root, 
(b) for a double root. Make sure that your problems 
cover all three Cases I, II, III (see Sec. 2.2). 


TEAM PROJECT. Extensions of the Method of 
Undetermined Coefficients. (a) Extend the method 
to products of the function in Table 2.1, (b) Extend 
the method to Euler-Cauchy equations. Comment on 
the practical significance of such extensions. 


2.8 Modeling: Forced Oscillations. Resonance 


In Sec. 2.4 we considered vertical motions of a mass—spring system (vibration of a mass 
m on an elastic spring, as in Figs. 33 and 53) and modeled it by the homogeneous linear 


ODE 


() 


my” + cy’ + ky =0. 


Here y(t) as a function of time ¢ is the displacement of the body of mass m from rest. 

The mass-spring system of Sec. 2.4 exhibited only free motion. This means no external 
forces (outside forces) but only internal forces controlled the motion. The internal forces 
are forces within the system. They are the force of inertia my”, the damping force cy’ 
(if c > 0), and the spring force ky, a restoring force. 


Fig. 53. 


Spring 


Mass [ro 
]| Dashpot 


Mass on a spring 


86 


CHAP. 2. Second-Order Linear ODEs 


We now extend our model by including an additional force, that is, the external force 
r(t), on the right. Then we have 


(2*) my" + cy’ + ky = r(0). 

Mechanically this means that at each instant f the resultant of the internal forces is in 
equilibrium with r(t). The resulting motion is called a forced motion with forcing function 
r(t), which is also known as input or driving force, and the solution y(t) to be obtained 
is called the output or the response of the system to the driving force. 


Of special interest are periodic external forces, and we shall consider a driving force 
of the form 


r(t) = Fo cos wt (Fo > 0, w > 0). 
Then we have the nonhomogeneous ODE 
(2) my” + cy’ + ky = Fo cos ot. 


Its solution will reveal facts that are fundamental in engineering mathematics and allow 
us to model resonance. 


Solving the Nonhomogeneous ODE (2) 


From Sec. 2.7 we know that a general solution of (2) is the sum of a general solution yy, 
of the homogeneous ODE (1) plus any solution y, of (2). To find y,, we use the method 
of undetermined coefficients (Sec. 2.7), starting from 


(3) Yp(t) = a cos wt + b sin wt. 
By differentiating this function (chain rule!) we obtain 


Yp = —@a sin wt + wb cos at, 


Yop = —w"a cos wt — wb sin wt. 
Substituting y,, Vos and Vn into (2) and collecting the cosine and the sine terms, we get 
[(k — mo”)a + wcb| cos wt + [—wca + (k — mo”)b] sin wt = Fo cos wt. 


The cosine terms on both sides must be equal, and the coefficient of the sine term 
on the left must be zero since there is no sine term on the right. This gives the two 
equations 


(k — mo*)a + a@cb = Fo 
(4) : 
—wca + (k— mw)b =0 
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for determining the unknown coefficients a and b. This is a linear system. We can solve 
it by elimination. To eliminate b, multiply the first equation by k — mo” and the second 
by —wc and add the results, obtaining 


(k — mw”)a + wc2a = Fo(k — mo”). 


Similarly, to eliminate a, multiply (the first equation by wc and the second by k — mo” 
and add to get 


wc7b + (k — maw)*b = Fowc. 


If the factor (k — ma)” + wc? is not zero, we can divide by this factor and solve for a 
and b, 


k — ma” wc 
a= Fo > b= Fo : 
(k — mo)? + wc? (k — mw)? + wc? 


If we set Vk/m = wo (> 0) as in Sec. 2.4, then k = moe, and we obtain 


2 2 
m(wo — w*) wc 
(5) a=Fo : b= Fo ; 
m2(we = One + wc? m2(we = oe + wc? 


We thus obtain the general solution of the nonhomogeneous ODE (2) in the form 
(6) y(t) = yr(t) + yp(0. 


Here yy, is a general solution of the homogeneous ODE (1) and y, is given by (3) with 
coefficients (5). 

We shall now discuss the behavior of the mechanical system, distinguishing between 
the two cases c = 0 (no damping) and c > 0 (damping). These cases will correspond to 
two basically different types of output. 


Case 1. Undamped Forced Oscillations. Resonance 


If the damping of the physical system is so small that its effect can be neglected over the 
time interval considered, we can set c = 0. Then (5) reduces to a = Fo/ [m(we = w”)] 
and b = (.. Hence (3) becomes (use Wo” = k/m) 


(7) @ r° i *° t 
y = cos wt = COS wt. 
eo mw — @”) KL — (w/o)"| 
Here we must assume that w? # wo”; physically, the frequency w/(27r) [cycles/sec] of 
the driving force is different from the natural frequency wo/(277) of the system, which is 
the frequency of the free undamped motion [see (4) in Sec. 2.4]. From (7) and from (4*) 
in Sec. 2.4 we have the general solution of the “undamped system” 


(8) y(t) = Cos (wot — 6) + Cos wt. 


m(we = w”) 
We see that this output is a superposition of two harmonic oscillations of the frequencies 
just mentioned. 
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Resonance. We discuss (7). We see that the maximum amplitude of y, is (put cos wt = 1) 
(9) ag =—p where p 


do depends on w and wo. If w — wo, then p and ag tend to infinity. This excitation of large 

oscillations by matching input and natural frequencies (w = wo) is called resonance. p is 

called the resonance factor (Fig. 54), and from (9) we see that p/k = do/Fo is the ratio 

of the amplitudes of the particular solution y, and of the input Fp cos wt. We shall see 

later in this section that resonance is of basic importance in the study of vibrating systems. 
In the case of resonance the nonhomogeneous ODE (2) becomes 


Fi 
(10) y" + wey = = COS Wot. 


Then (7) is no longer valid, and, from the Modification Rule in Sec. 2.7, we conclude that 
a particular solution of (10) is of the form 


Yp(t) = t(a cos wot + D sin wot). 


p 


Fig. 54. Resonance factor p(w) 


By substituting this into (10) we find a = 0 and b = Fo/(2ma). Hence (Fig. 55) 


(11) vob) = t sin Wof. 


Fo 
2mwo 


Fig. 55. Particular solution in the case of resonance 


We see that, because of the factor f, the amplitude of the vibration becomes larger and 
larger. Practically speaking, systems with very little damping may undergo large vibrations 
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THEOREM 1 


that can destroy the system. We shall return to this practical aspect of resonance later in 
this section. 


Beats. Another interesting and highly important type of oscillation is obtained if @ is 
close to wo. Take, for example, the particular solution [see (8)] 


F 
(12) y(t) = —5°—5- (cos wt — cos wof) (w # wo). 
MW — w 


Using (12) in App. 3.1, we may write this as 


2Fo _ [@ + @ _ [@ — @ 
y(t) = 5 3, sin t } sin t }. 
m(wo — w*) 2 2 


Since w is close to wo, the difference w 9 — w is small. Hence the period of the last sine 
function is large, and we obtain an oscillation of the type shown in Fig. 56, the dashed 
curve resulting from the first sine factor. This is what musicians are listening to when 
they tune their instruments. 
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Fig. 56. Forced undamped oscillation when the difference of the input 
and natural frequencies is small (“beats”) 


Case 2. Damped Forced Oscillations 


If the damping of the mass—spring system is not negligibly small, we have c > 0 and 
a damping term cy’ in (1) and (2). Then the general solution y, of the homogeneous 
ODE (1) approaches zero as f goes to infinity, as we know from Sec. 2.4. Practically, 
it is zero after a sufficiently long time. Hence the “transient solution” (6) of (2), 
given by y = yp, + yp, approaches the “steady-state solution” y,. This proves the 
following. 


Steady-State Solution 


After a sufficiently long time the output of a damped vibrating system under a purely 
sinusoidal driving force |see (2)] will practically be a harmonic oscillation whose 
frequency is that of the input. 
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Amplitude of the Steady-State Solution. Practical Resonance 


Whereas in the undamped case the amplitude of y, approaches infinity as w approaches 
@o, this will not happen in the damped case. In this case the amplitude will always be 
finite. But it may have a maximum for some w depending on the damping constant c. 
This may be called practical resonance. It is of great importance because if c is not too 
large, then some input may excite oscillations large enough to damage or even destroy 
the system. Such cases happened, in particular in earlier times when less was known about 
resonance. Machines, cars, ships, airplanes, bridges, and high-rising buildings are vibrating 
mechanical systems, and it is sometimes rather difficult to find constructions that are 
completely free of undesired resonance effects, caused, for instance, by an engine or by 
strong winds. 
To study the amplitude of y, as a function of w, we write (3) in the form 


(13) yp(t) = C* cos (wt — 1). 


C* is called the amplitude of y, and 7 the phase angle or phase lag because it measures 
the lag of the output behind the input. According to (5), these quantities are 


Fo 
C¥(w) = Va? + b2 = 5 
V mwa — ow)? + wc? 
(14) 


WC 


tan 7 (@) a 
o)=—-= : 
* a mwa = w) 


Let us see whether C*(w) has a maximum and, if so, find its location and then its size. 
We denote the radicand in the second root in C* by R. Equating the derivative of C* to 
zero, we obtain 


o = Fo(-48-9”) [2m2(we — w)(—2) + 2c]. 
The expression in the brackets [. . .] is zero if 
(15) c= 2m*(we — w”) — (we = k/m). 
By reshuffling terms we have 
Imo" = Ima" — c* = Imk — c*. 
The right side of this equation becomes negative if c? > 2mk, so that then (15) has no 


real solution and C* decreases monotone as w increases, as the lowest curve in Fig. 57 
shows. If c is smaller, c? < 2mk, then (15) has a real solution w = @max, Where 


(15*) Omax = 0 — —>- 


From (15*) we see that this solution increases as c decreases and approaches wo as c 
approaches zero. See also Fig. 57. 
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The size of C*(@max) is obtained from (14), with w= aa given by (15*). For this 


w* we obtain in the second radicand in (14) from (15*) 


4 2 
2.2 2 2 ¢ 2 or, 2 c 2 
m(W9 — Wmax)” = —> and OmaxC = (03 = Je ; 
4m 2m 


The sum of the right sides of these two formulas is 
(c4 + 4m?wec? — 2c4)/(4m?) = c?2(4m?w% — c)/(4m?). 
Substitution into (14) gives 


2mF, 0 


cV 4m20% -< 


(16) C(@max) a 


We see that C*(Wmax) is always finite when c > 0. Furthermore, since the expression 
c74m we —ct= c2(4mk = c?) 


in the denominator of (16) decreases monotone to zero as c (<2mk) goes to zero, the maximum 
amplitude (16) increases monotone to infinity, in agreement with our result in Case 1. Figure 57 
shows the amplification C*/Fo (ratio of the amplitudes of output and input) as a function of 


w@ form = 1,k = 1, hence wp = 1, and various values of the damping constant c. 
Figure 58 shows the phase angle (the lag of the output behind the input), which is less 
than 77/2 when w < wo, and greater than 77/2 for w > wo. 


Fig. 57. Amplification C*/Fp as a function of 
w for m = 1,k = 1, and various values of the 


damping constant c 
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Fig. 58. Phase lag 7) as a function of w for 
m = 1,k = 1, thus wo = 1, and various values 
of the damping constant c 


PROBLEM SET 27-8 


1. WRITING REPORT. Free and Forced Vibrations. 
Write a condensed report of 2-3 pages on the most 
important similarities and differences of free and forced 
vibrations, with examples of your own. No proofs. 


2. Which of Probs. 1-18 in Sec. 2.7 (with x = time 1) 
can be models of mass—spring systems with a harmonic 
oscillation as steady-state solution? 


3-7 


STEADY-STATE SOLUTIONS 


Find the steady-state motion of the mass—spring system 
modeled by the ODE. Show the details of your work. 


3. y 
4. y 


” 


” 


+ 6y’ + 8y = 42.5 cos 2t 


2.5y’ + 10y = —13.6 sin 4t 


5. (D? + D + 4.251)y = 22.1 cos 4.5¢ 
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6. (D2 + 4D + 3Dy = cost + 4 cos 3t 
7. (4D? + 12D + 9Dy = 225 — 75 sin 3t 


8-15 | TRANSIENT SOLUTIONS 


Find the transient motion of the mass—spring system 
modeled by the ODE. Show the details of your work. 


8. 2y" + 4y’ + 6.5y = 4 sin L.5t 
9. y"” + 3y’ + 3.25y = 3cost 
10. y" + 16y = 56 cos 4t 

11. (D? + 2/)y = cos V2t + sinV2t 

12. (D2 + 2D + 5Dy = 4 cost + 8 sint 
13. (D2 + Dy = cos at, w? # 1 

14. (D? + ly = 5e ‘cost 

15. (D? + 4D + 8/)y = 2 cos 2r + sin 2t 


1.5 sin t 


16-20; INITIAL VALUE PROBLEMS 


Find the motion of the mass—spring system modeled by the 
ODE and the initial conditions. Sketch or graph the solution 
curve. In addition, sketch or graph the curve of y — y, to 
see when the system practically reaches the steady state. 


16. y" + 25y = 24sint, y0)=1, y'O)=1 
17. (D? 4Dy = sint 4 4 sin 3r + 2 sin St, 
y0)=0, yYO=% 


18. (D? + 8D + 17Dy = 474.5 sin 0.5t, y(0) = —5.4, 
y'(0) = 9.4 
19, (D2 + 2D + 2Ny =e" sindt, yO) = 0, 
t 
y(O)=1 


20. (D? + 5I)y =cos amt — sinat, y(0) =0, y’(0) =0 


21. Beats. Derive the formula after (12) from (12). Can 
we have beats in a damped system? 


22. Beats. Solve y” + 25y = 99 cos 4.9t, (0) = 2, 
y’(0) = 0. How does the graph of the solution change 
if you change (a) y(0), (b) the frequency of the driving 
force? 


23. TEAM EXPERIMENT. Practical Resonance. 
(a) Derive, in detail, the crucial formula (16). 
(b) By considering dC*/dc show that C*(@max) in- 
creases as c (S V2mk) decreases. 
(c) Illustrate practical resonance with an ODE of your 
own in which you vary c, and sketch or graph 
corresponding curves as in Fig. 57. 
(d) Take your ODE with c fixed and an input of two 
terms, one with frequency close to the practical 
resonance frequency and the other not. Discuss and 
sketch or graph the output. 
(e) Give other applications (not in the book) in which 
resonance is important. 


24. Gun barrel. Solve y” + y=1- 17/7? if OS 
t = 7 and 0 if t+; here, y(0) = 0, y’(0) = 0. This 
models an undamped system on which a force F acts 
during some interval of time (see Fig. 59), for instance, 
the force on a gun barrel when a shell is fired, the barrel 
being braked by heavy springs (and then damped by a 
dashpot, which we disregard for simplicity). Hint: At 77 
both y and y’ must be continuous. 


m=1 k=1 
“ 
Fig. 59. Problem 24 


25. CAS EXPERIMENT. Undamped Vibrations. 
(a) Solve the initial value problem y” + y = cos ot, 
wo #1, y(0) = 0, y'(0) = 0. Show that the solution 
can be written 


2 ay 4 nd — 
yO = i=a2 sin [5 (1 + w)f] sin [5 (1 — o)¢]. 


(b) Experiment with the solution by changing w to 
see the change of the curves from those for small 
w (0) to beats, to resonance, and to large values of 
w (see Fig. 60). 
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Fig. 60. Typical solution curves in CAS Experiment 25 
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2.9 Modeling: Electric Circuits 


Designing good models is a task the computer cannot do. Hence setting up models has 
become an important task in modern applied mathematics. The best way to gain experience 
in successful modeling is to carefully examine the modeling process in various fields and 
applications. Accordingly, modeling electric circuits will be profitable for all students, 
not just for electrical engineers and computer scientists. 

Figure 61 shows an RLC-circuit, as it occurs as a basic building block of large electric 
networks in computers and elsewhere. An RLC-circuit is obtained from an RL-circuit by 
adding a capacitor. Recall Example 2 on the RL-circuit in Sec. 1.5: The model of the 
RL-circuit is LI’ + RI = E(t). It was obtained by KVL (Kirchhoff’s Voltage Law)’ by 
equating the voltage drops across the resistor and the inductor to the EMF (electromotive 
force). Hence we obtain the model of the RLC-circuit simply by adding the voltage drop 
Q/C across the capacitor. Here, C F (farads) is the capacitance of the capacitor. Q coulombs 
is the charge on the capacitor, related to the current by 


I(t) = “ equivalently O(t) = | I(t) dt. 


See also Fig. 62. Assuming a sinusoidal EMF as in Fig. 61, we thus have the model of 
the RLC-circuit 


E(t) = E, sin ot 


Fig. 61. RLC-circuit 


Name Symbol Notation Unit Voltage Drop 
Ohm’s Resistor WA R  Ohm’s Resistance ohms (Q) RI 
Inductor “VSST-  L Inductance henrys (H) L ue 


Q 


Capacitor —j Ke 


Capacitance farads (F) Q/C 


Fig. 62. Elements in an RLC-circuit 


“GUSTAV ROBERT KIRCHHOFF (1824-1887), German physicist. Later we shall also need Kirchhoff’s 
Current Law (KCL): 

At any point of a circuit, the sum of the inflowing currents is equal to the sum of the outflowing currents. 

The units of measurement of electrical quantities are named after ANDRE MARIE AMPERE (1775-1836), 
French physicist, CHARLES AUGUSTIN DE COULOMB (1736-1806), French physicist and engineer, 
MICHAEL FARADAY (1791-1867), English physicist, JOSEPH HENRY (1797-1878), American physicist, 
GEORG SIMON OHM (1789-1854), German physicist, and ALESSANDRO VOLTA (1745-1827), Italian 
physicist. 
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(1’) Ll’ + RI+ fra = E(t) = Eo sin ot. 


This is an “integro-differential equation.” To get rid of the integral, we differentiate (1’) 
with respect to t, obtaining 


1 
(1) LI" + RI’ + a = E'(t) = Eow cos at. 


This shows that the current in an RLC-circuit is obtained as the solution of this 
nonhomogeneous second-order ODE (1) with constant coefficients. 
In connection with initial value problems, we shall occasionally use 


a”) LQ" + RQ" + =O = Fld, 


obtained from (1’) and J = Q'. 


Solving the ODE (1) for the Current in an RLC-Circuit 


A general solution of (1) is the sum J = J; + Ip, where J, is a general solution of the 
homogeneous ODE corresponding to (1) and J, is a particular solution of (1). We first 
determine J, by the method of undetermined coefficients, proceeding as in the previous 
section. We substitute 


(2) Ip = acos wt + b sin wt 
i = w(—a sin wt + bcos wf) 
ng ‘ 
Ip = @°(—a cos wt — b sin at) 


into (1). Then we collect the cosine terms and equate them to Egw cos wt on the right, 
and we equate the sine terms to zero because there is no sine term on the right, 


Lw*(—a) + Rab + a/C = Eow (Cosine terms) 

Lw*(—b) + Ra(—a) + b/C = 0 (Sine terms). 
Before solving this system for a and b, we first introduce a combination of L and C, called 
the reactance 


(3) SS Oh =, 
@ 


Dividing the previous two equations by w, ordering them, and substituting S' gives 


—Ra — Sb = 0. 
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We now eliminate b by multiplying the first equation by S and the second by R, and 
adding. Then we eliminate a by multiplying the first equation by R and the second by 
—S, and adding. This gives 

—(S2. + Ra = ES, — (R*. + S*)b = EoR. 


We can solve for a and b, 


—EoS EoR 
Pas R? + Ss? 


(4) a= 


Equation (2) with coefficients a and b given by (4) is the desired particular solution J, of 
the nonhomogeneous ODE (1) governing the current J in an RLC-circuit with sinusoidal 
electromotive force. 

Using (4), we can write J, in terms of “physically visible” quantities, namely, amplitude 
Ig and phase lag 0 of the current behind the EMF, that is, 


(5) Ip(t) = Ip sin (wt — 8) 


where [see (14) in App. A3.1] 


E 
Io = Va? + 62 = ——— , tang=—“* = 
VR? + S? b 


ain 


The quantity V R? + $7 is called the impedance. Our formula shows that the impedance 
equals the ratio Eo/Ip. This is somewhat analogous to E/J = R (Ohm’s law) and, because 
of this analogy, the impedance is also known as the apparent resistance. 

A general solution of the homogeneous equation corresponding to (1) is 


i= ce" + coe? 


where A, and Ag are the roots of the characteristic equation 


R 1 
M+—A+—-=0. 
L LC 2 
We can write these roots in the form Ay = —a + B and Ag = —a — B, where 
R | a oe 4L 
a=—, B= R? . 
2L 402, LC 2L Cc 


Now in an actual circuit, R is never zero (hence R > 0). From this it follows that J), 
approaches zero, theoretically as t — oo, but practically after a relatively short time. Hence 
the transient current / = /;, + Ip tends to the steady-state current /,, and after some time 
the output will practically be a harmonic oscillation, which is given by (5) and whose 
frequency is that of the input (of the electromotive force). 
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RLC-Circuit 


Find the current /(t) in an RLC-circuit with R = 11 Q (ohms), L = 0.1 H (henry), C = 10~? F (farad), which 
is connected to a source of EMF E(t) = 110 sin (60 - 27rt) = 110 sin 377 ¢ (hence 60 Hz = 60 cycles/ sec, the 
usual in the U.S. and Canada; in Europe it would be 220 V and 50 Hz). Assume that current and capacitor 
charge are 0 when ¢ = 0. 


Solution. Step 1. General solution of the homogeneous ODE. Substituting R, L, C and the derivative E’ (t) 
into (1), we obtain 


0." + 111’ + 1007 = 110 + 377 cos 377t. 
Hence the homogeneous ODE is 0.17” + 11/' + 100/ = 0. Its characteristic equation is 
0.142 + 114 + 100 = 0. 
The roots are Ay = —10 and Ay = —100. The corresponding general solution of the homogeneous ODE is 
—100¢ 


I(t) = ce 1% + cee 


Step 2. Particular solution I, of (1). We calculate the reactance S = 37.7 — 0.3 = 37.4 and the steady-state 
current 


[,(t) = acos 377t + b sin 377t 
with coefficients obtained from (4) (and rounded) 


—110 - 37.4 110-11 
a= eee = =271, = quad) 1 Saad = 0.796. 
11° + 37.4 11° + 37.4 
Hence in our present case, a general solution of the nonhomogeneous ODE (1) is 


(6) I(t) = cye 1! + cge~ 10 — 2.71 cos 377t + 0.796 sin 3771. 


Step 3. Particular solution satisfying the initial conditions. How to use Q(0) = 0? We finally determine c, 
and cy from the in initial conditions 1(0) = 0 and Q(0) = 0. From the first condition and (6) we have 


(7) I(O) = cy + cg — 2.71 = 0, hence co = 2.71 — cy. 


We turn to Q(0) = 0. The integral in (1’) equals f7 dt = Q(4); see near the beginning of this section. Hence for 
t = 0, Eq. ai becomes 


LI'(0) +R: 0=0, so that (0) = 0. 


Differentiating (6) and setting t = 0, we thus obtain 


1'(0) 10cy — 100cg + 0 + 0.796 - 377 = 0, hence by (7), —10cy = 100(2.71 — cy) — 300.1. 


The solution of this and (7) is cy = —0.323, cg = 3.033. Hence the answer is 
I(t) = —0.323e7 1% + 3.033e7 10% — 2.71 cos 377t + 0.796 sin 3771. 


You may get slightly different values depending on the rounding. Figure 63 shows J(f) as well as [,(¢), which 
practically coincide, except for a very short time near t = 0 because the exponential terms go to zero very rapidly. 
Thus after a very short time the current will practically execute harmonic oscillations of the input frequency 
60 Hz = 60 cycles/ sec. Its maximum amplitude and phase lag can be seen from (5), which here takes the form 


Ip(t) = 2.824 sin (377t — 1.29). | 
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Fig. 63. Transient (upper curve) and steady-state currents in Example 1 


Analogy of Electrical and Mechanical Quantities 


Entirely different physical or other systems may have the same mathematical model. 
For instance, we have seen this from the various applications of the ODE y’ = ky in 
Chap. 1. Another impressive demonstration of this unifying power of mathematics is 
given by the ODE (1) for an electric RLC-circuit and the ODE (2) in the last section for 
a mass-—spring system. Both equations 


1 
LI" + RI + oc! = Eqw cos wt and my” + cy’ + ky = Focos ot 


are of the same form. Table 2.2 shows the analogy between the various quantities involved. 
The inductance L corresponds to the mass m and, indeed, an inductor opposes a change 
in current, having an “inertia effect” similar to that of a mass. The resistance R corresponds 
to the damping constant c, and a resistor causes loss of energy, just as a damping dashpot 
does. And so on. 

This analogy is strictly quantitative in the sense that to a given mechanical system we 
can construct an electric circuit whose current will give the exact values of the displacement 
in the mechanical system when suitable scale factors are introduced. 

The practical importance of this analogy is almost obvious. The analogy may be used 
for constructing an “electrical model” of a given mechanical model, resulting in substantial 
savings of time and money because electric circuits are easy to assemble, and electric 
quantities can be measured much more quickly and accurately than mechanical ones. 


Table 2.2 Analogy of Electrical and Mechanical Quantities 


Electrical System Mechanical System 
Inductance L Mass m 
Resistance R Damping constant c 
Reciprocal 1/C of capacitance Spring modulus k 


Driving force Fo cos wt 


Derivative Egw cos wt of | 
electromotive force 


Current J(t) Displacement y(t) 
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Related to this analogy are transducers, devices that convert changes in a mechanical 
quantity (for instance, in a displacement) into changes in an electrical quantity that can 
be monitored; see Ref. [GenRef11] in App. 1. 


PROBLEM SET 2-9 


1-6| RLC-CIRCUITS: SPECIAL CASES 


1. RC-Circuit. Model the RC-circuit in Fig. 64. Find the 
current due to a constant E. 


R 


E(t) 
Cc 
Fig. 64. RC-circuit 


Current I(t) 


c 


t 


Fig. 65. Current 1 in Problem 1 


2. RC-Circuit. Solve Prob. 1 when E = Egsin wt and 
R, C, Eo, and w are arbitrary. 

3. RL-Circuit. Model the RL-circuit in Fig. 66. Find a 
general solution when R, L, E are any constants. Graph 
or sketch solutions when L = 0.25 H, R = 10Q, and 


E=48V. 
R 
E(t) 
L 
Fig. 66. RL-circuit 
Current I(t) 
5 
4 be 
3 Pex 
2 Los. 
1 
| | | L 


l 
0 0.02 0.04 0.06 0.08 0.1 t 
Fig. 67. Currents in Problem 3 


4. RL-Circuit. Solve Prob. 3 when E = Eg sin wt and R, 
L, Eo, and are arbitrary. Sketch a typical solution. 


Current I(t) 


A ae q 7 f 
ey 


Fig. 68. Typical current | = e~°" + sin(t — $77) 
in Problem 4 


-1F 


5. LC-Circuit. This is an RLC-circuit with negligibly 
small R (analog of an undamped mass-spring system). 
Find the current when L = 0.5 H, C = 0.005 F, and 
E = sint V, assuming zero initial current and charge. 


‘| L 
E(t) 


Fig. 69. LC-circuit 


6. LC-Circuit. Find the current when L=0.5H, 
C = 0.005 F, E = 21 V, and initial current and charge 
Zero. 
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7. Tuning. In tuning a stereo system to a radio station, 
we adjust the tuning control (turn a knob) that changes 
C (or perhaps L) in an RLC-circuit so that the amplitude 
of the steady-state current (5) becomes maximum. For 
what C will this happen? 


8-14 | Find the steady-state current in the RLC-circuit 

in Fig. 61 for the given data. Show the details of your work. 
8 R= 40,L=05H,C =0.1F,E = 500 sin 2r V 
9. R=40,L =0.1H,C = 0.05 F, E = 110 V 

10. R=20,L =1H,C = oF, E = 157sin3¢V 
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11. 


12. 
13. 


14. 


15. 


R=120,L=04H,C =a6F, 
E = 220sin 10 V 


R=020,L =0.1H,C = 2F,E = 220sin314tV 
R=12,L=12H,C=%- 10° 3F, 
E = 12,000 sin 251 V 


Prove the claim in the text that if R # 0 (hence R > 0), 
then the transient current approaches I, as f—> ©. 


Cases of damping. What are the conditions for an 
RLC-circuit to be (I) overdamped, (ID) critically damped, 
(II) underdamped? What is the critical resistance Rerit 
(the analog of the critical damping constant 2V/mk)? 


Solve the initial value problem for the RLC- 
circuit in Fig. 61 with the given data, assuming zero initial 
current and charge. Graph or sketch the solution. Show the 
details of your work. 


20. 


~R=60,L 
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. R=80,L =0.2H,C = 12.5- 1073 F, 


E = 100 sin 10t V 


1H, C = 0.04 F, 
E = 600 (cost + 4sin t) V 


.R=180,L=1H,C = 125-1073, 


E = 820 cos 10t V 


WRITING REPORT. Mechanic-Electric Analogy. 
Explain Table 2.2 in a 1-2 page report with examples, 
e.g., the analog (with L = 1 H) of a mass-spring system 
of mass 5 kg, damping constant 10 kg/sec, spring constant 
60 kg/sec”, and driving force 220 cos 10t kg/sec. 
Complex Solution Method. Solve Li” + Ri’ + 
1/C = Ege’, i= V—I, by substituting I, = Ke’ 
(K unknown) and its derivatives and taking the real 
part I of the solution Ls Show agreement with (2), (4). 
Hint: Use (11) et = cos wt + isinat; cf. Sec. 2.2, 
and i? = —1. 


2.10 Solution by Variation of Parameters 


We continue our discussion of nonhomogeneous linear ODEs, that is 


(1) 


y" + pay’ + q@y = rx). 


In Sec. 2.6 we have seen that a general solution of (1) is the sum of a general solution yy, 
of the corresponding homogeneous ODE and any particular solution y, of (1). To obtain y, 
when r(x) is not too complicated, we can often use the method of undetermined coefficients, 
as we have shown in Sec. 2.7 and applied to basic engineering models in Secs. 2.8 and 2.9. 
However, since this method is restricted to functions r(x) whose derivatives are of a form 
similar to r(x) itself (powers, exponential functions, etc.), it is desirable to have a method valid 
for more general ODEs (1), which we shall now develop. It is called the method of variation 
of parameters and is credited to Lagrange (Sec. 2.1). Here p, g, r in (1) may be variable 
(given functions of x), but we assume that they are continuous on some open interval J. 
Lagrange’s method gives a particular solution y, of (1) on J in the form 


(2) 


if r 
Vp) = -y, |= dx + v2 | dx 


where yy, yg form a basis of solutions of the corresponding homogeneous ODE 


(3) 


on J, and W is the Wronskian of yj, ye, 


(4) 


CAUTION! 


W = yiya — yeyl 


y" + p@y’ + q@y = 0 


(see Sec. 2.6). 


The solution formula (2) is obtained under the assumption that the ODE 


is written in standard form, with y” as the first term as shown in (1). If it starts with 


f(xy", divide first by f(x). 


100 


EXAMPLE 1 


CHAP. 2. Second-Order Linear ODEs 


The integration in (2) may often cause difficulties, and so may the determination of 
y1, ya if (1) has variable coefficients. If you have a choice, use the previous method. It is 
simpler. Before deriving (2) let us work an example for which you do need the new 
method. (Try otherwise.) 


Method of Variation of Parameters 


Solve the nonhomogeneous ODE 


1 


u” 
y +y=secx = —. 
¥ # cos x 


Solution. A basis of solutions of the homogeneous ODE on any interval is y; = cos x, yo = sin x. This gives 
the Wronskian 


W(y1, y2) = cos x cos x — sin.x (—sinx) = 1. 


From (2), choosing zero constants of integration, we get the particular solution of the given ODE 


Yp = —cos x sin x sec x dx + sin x feos x sec x dx 


= cos x In |cos x| + x sinx (Fig. 70) 
Figure 70 shows y, and its first term, which is small, so that x sin x essentially determines the shape of the curve 
of yp. (Recall from Sec. 2.8 that we have seen x sin x in connection with resonance, except for notation.) From 
Yp and the general solution y;, = cyyy + cgyg of the homogeneous ODE we obtain the answer 
Y= Yn + Yp =(a1 + In |cos x|) cos x + (cg + x) sin x. 
Had we included integration constants —c,,cg in (2), then (2) would have given the additional 


cy cos x + cg sinx = cyyy + Coyo, that is, a general solution of the given ODE directly from (2). This will 
always be the case. fai] 
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2 \4 6 8 10 12 x 
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Fig. 70. Particular solution y, and its first term in Example 1 


Idea of the Method. Derivation of (2) 


What idea did Lagrange have? What gave the method the name? Where do we use the 
continuity assumptions? 
The idea is to start from a general solution 


ya(x) = cyyi(x) + ceye(x) 
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of the homogeneous ODE (3) on an open interval J and to replace the constants (“the 
parameters”) c, and cy by functions u(x) and u(x); this suggests the name of the method. 
We shall determine uw and vu so that the resulting function 


(5) Yp(x) = ula)yi(x) + v@)ye) 

is a particular solution of the nonhomogeneous ODE (1). Note that y, exists by Theorem 
3 in Sec. 2.6 because of the continuity of p and q on J. (The continuity of r will be used 
later.) 


We determine u and v by substituting (5) and its derivatives into (1). Differentiating (5), 
we obtain 


, , , , , 

Yp = Uy, + Uy, + UV yo + Vo. 
Now y, must satisfy (1). This is one condition for two functions u and v. It seems plausible 
that we may impose a second condition. Indeed, our calculation will show that we can 
determine u and v such that y, satisfies (1) and wu and v satisfy as a second condition the 
equation 
(6) wy, + v'y2 = 0. 
This reduces the first derivative Yp to the simpler form 

, , , 

(7) Yp = uy, + Vye. 
Differentiating (7), we obtain 
(8) Yp = uy, + uy + v' yo + vyg. 


We now substitute y, and its derivatives according to (5), (7), (8) into (1). Collecting 
terms in uw and terms in v, we obtain 


u(y, + pyr + gyi) + v(ye + pys + qy2) + u'y, + v' ys =r. 


Since y, and yg are solutions of the homogeneous ODE (3), this reduces to 


(9a) w'y + v'y2 
Equation (6) is 

(9b) u'y, + v' yo = 0. 

This is a linear system of two algebraic equations for the unknown functions u’ and v’. 
We can solve it by elimination as follows (or by Cramer’s rule in Sec. 7.6). To eliminate 


v’, we multiply (9a) by —yg and (9b) by ys and add, obtaining 


u'(yiy2 — yayt) = Yer, thus ou! W = —yor- 


Here, W is the Wronskian (4) of y;, yo. To eliminate u’ we multiply (9a) by y, and (9b) 
by —y1 and add, obtaining 
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2. 


v'(yiy2 — yayi) = —y1r, 


CHAP. 2. Second-Order Linear ODEs 


thus  v’W=yyr. 


Since yj, yg form a basis, we have W # 0 (by Theorem 2 in Sec. 2.6) and can divide by W, 


r a 
(10) peo! wo 
W W 
By integration, 
: 
u = -|%2 dx, v= [as 
W W 


These integrals exist because r(x) is continuous. Inserting them into (5) gives (2) and 


completes the derivation. 
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GENERAL SOLUTION 


Solve the given nonhomogeneous linear ODE by variation 


of 


parameters or undetermined coefficients. Show the 


details of your work. 


1. y” + 9y = sec 3x 

2. y” + Oy = ese 3x 

3. x7y" — Qxy’ + 2y = x3 sinx 

4. y” — 4y' + 5y = e* esc x 

5. y" +y =cosx — sinx 

6. (D2 + 6D + 9Dy = 16e73"/(x? + 1) 
7. (D® — 4D + 4Dy = 6e2"/x* 

8. (D2 + 41)y = cosh 2x 

9, (D? — 2D + Dy = 35x9/e* 
10. (D? + 2D + 2Dy = 4e~* sec? x 


. (x2D? — 4xD + 6Dy = 21x74 

(D® — Dy = 1/coshx 

. (x2D2 + xD — 91)y = 48x° 

. TEAM PROJECT. Comparison of Methods. Inven- 
tion. The undetermined-coefficient method should be 
used whenever possible because it is simpler. Compare 
it with the present method as follows. 
(a) Solve y” + 4y’ + 3y = 65 cos 2x by both methods, 
showing all details, and compare. 
(b) Solve y” — 2y’ + y =r, t+ ra, ry = 35x7/e* rg = 
x” by applying each method to a suitable function on 
the right. 
(c) Experiment to invent an undetermined-coefficient 
method for nonhomogeneous Euler—Cauchy equations. 


CHAPTER -2- REVIEW QUESTIONS AND PROBLEMS 


1. 


Why are linear ODEs preferable to nonlinear ones in 
modeling? 


. What does an initial value problem of a second-order 
ODE look like? Why must you have a general solution 
to solve it? 


. By what methods can you get a general solution of a 
nonhomogeneous ODE from a general solution of a 
homogeneous one? 

. Describe applications of ODEs in mechanical systems. 
What are the electrical analogs of the latter? 


. What is resonance? How can you remove undesirable 
resonance of a construction, such as a bridge, a ship, 
or a machine? 


. What do you know about existence and uniqueness of 
solutions of linear second-order ODEs? 
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GENERAL SOLUTION 


Find a general solution. Show the details of your calculation. 
7. dy” + 32y' + 63y =0 


u” 


y +y' — y= 0 

y" + 6y’ + 34y =0 

y” + 0.20y’ + 0.17y = 0 

. (00D? — 160D + 64/)y = 0 

. (D? + 4D + 47771)y = 0 

. (x2D? + 2xD — 12Dy = 0 
(x2D? + xD — 9Dy = 0 

. (2D? — 3D — 21)y = 13 — 2x? 
. (D? + 2D + 21)y = 3e~* cos 2x 
(4D? — 12D + 9J)y = 2e1** 

» yy” = 2y'? 


Summary of Chapter 2 


19-22 | INITIAL VALUE PROBLEMS 


Solve the problem, showing the details of your work. 
Sketch or graph the solution. 


19, y” + l6y = 17e”, y(0)=6, y'(0) = -2 

20. y” — 3y’ + 2y = 10sinx, y0)=1, y'(0) = -6 

21. (x?D? + xD — Dy = 16x*, yA)=-1, y=1 

22. (x2D? + 15xD + 49Dy = 0, y(1) = 2, 

ydj==11 

23-30} APPLICATIONS 

23. Find the steady-state current in the RLC-circuit in Fig. 71 
when R= 2 k0, (20000), L=1H,C=4- 10-3F, and 
E = 110 sin 415r V (66 cycles/sec). 

24. Find a general solution of the homogeneous linear 
ODE corresponding to the ODE in Prob. 23. 

25. Find the steady-state current in the RLC-circuit 
in Fig. 71 when R = 50Q0, L = 30H, C = 0.025 F, 
E = 200 sin 4r V. 


C 
4 3 
E(t) 
Fig. 71. RLC-circuit 
26. Find the current in the RLC-circuit in Fig. 71 


when R= 400, L=04H, C=10°*F, E= 
220 sin 314r V (50 cycles/sec). 
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27. Find an electrical analog of the mass—spring system 
with mass 4 kg, spring constant 10 kg/ sec”, damping 
constant 20 kg/sec, and driving force 100 sin 4¢ nt. 


28. Find the motion of the mass—spring system in Fig. 72 
with mass 0.125 kg, damping 0, spring constant 
1.125 kg/sec®, and driving force cos t — 4 sin f nt, ass- 
uming zero initial displacement and velocity. For what 
frequency of the driving force would you get resonance? 


k Spring 
m Mass 
c Dashpot 


Fig. 72. Mass—spring system 


29. Show that the system in Fig. 72 with m = 4,c = 0, 
k = 36, and driving force 61 cos 3.1f exhibits beats. 
Hint: Choose zero initial conditions. 


30. In Fig. 72, letm = 1kg,c = 4kg/sec,k = 24kg/sec?, 
and r(t) = 10 cos wt nt. Determine w such that you 
get the steady-state vibration of maximum possible 
amplitude. Determine this amplitude. Then find the 
general solution with this w and check whether the results 
are in agreement. 


Second-order linear ODEs are particularly important in applications, for instance, 
in mechanics (Secs. 2.4, 2.8) and electrical engineering (Sec. 2.9). A second-order 
ODE is called linear if it can be written 


(1) y” + p@y’ + aay = r(x) (Sec. 2.1). 


(If the first term is, say, f(x)y”, divide by f(x) to get the “standard form” (1) with 
y” as the first term.) Equation (1) is called homogeneous if r(x) is zero for all x 


considered, usually in some open interval; this is written r(x) = 0. Then 
(2) y” + p@y’ + q@y = 0. 


Equation (1) is called nonhomogeneous if r(x) # 0 (meaning r(x) is not zero for 
some x considered). 
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For the homogeneous ODE (2) we have the important superposition principle (Sec. 
2.1) that a linear combination y = ky, + lyg of two solutions y1, yo is again a solution. 

Two linearly independent solutions y,, yg of (2) on an open interval / form a basis 
(or fundamental system) of solutions on /. and y = cyy, + coy with arbitrary 
constants cj, Cg a general solution of (2) on /. From it we obtain a particular 
solution if we specify numeric values (numbers) for cy and cy, usually by prescribing 
two initial conditions 


(3) yW(Xo) = Ko, y' (xo) = Ky (xo, Ko, Kz given numbers; Sec. 2.1). 


(2) and (3) together form an initial value problem. Similarly for (1) and (3). 
For a nonhomogeneous ODE (1) a general solution is of the form 


(4) y= Yn + Yp (Sec. 2.7). 


Here yp is a general solution of (2) and y, is a particular solution of (1). Such a y, 
can be determined by a general method (variation of parameters, Sec. 2.10) or in 
many practical cases by the method of undetermined coefficients. The latter applies 
when (1) has constant coefficients p and qg, and r(x) is a power of x, sine, cosine, 
etc. (Sec. 2.7). Then we write (1) as 


(3) y" tay’ + by = r(x) (Sec. 2.7). 


AX 
> 


The corresponding homogeneous ODE y’ + ay’ + by = 0 has solutions y = e 
where A is a root of 


(6) M+ adA+b=0. 


Hence there are three cases (Sec. 2.2): 


Case Type of Roots General Solution 

I Distinct real Ay, Ag y = cye*™ + coe?” 

II Double —sa y=(cy + coxye - 

I Complex —sa + iw* y= eo cos w*x + B sin w*x) 


Here w* is used since w is needed in driving forces. 

Important applications of (5) in mechanical and electrical engineering in connection 
with vibrations and resonance are discussed in Secs. 2.4, 2.7, and 2.8. 

Another large class of ODEs solvable “algebraically” consists of the Euler-Cauchy 
equations 


(7) xy" + axy’ + by =0 (Sec. 2.5). 
These have solutions of the form y = x”’, where mis a solution of the auxiliary equation 
(8) m2 + (a- 1)m+b=0. 


Existence and uniqueness of solutions of (1) and (2) is discussed in Secs. 2.6 
and 2.7, and reduction of order in Sec. 2.1. 


CHAPTER 3 


Higher Order Linear ODEs 


The concepts and methods of solving linear ODEs of order n = 2 extend nicely to linear 
ODEs of higher order n, that is, n = 3,4, etc. This shows that the theory explained in 
Chap. 2 for second-order linear ODEs is attractive, since it can be extended in a 
straightforward way to arbitrary n. We do so in this chapter and notice that the formulas 
become more involved, the variety of roots of the characteristic equation (in Sec. 3.2) 
becomes much larger with increasing n, and the Wronskian plays a more prominent role. 

The concepts and methods of solving second-order linear ODEs extend readily to linear 
ODEs of higher order. 

This chapter follows Chap. 2 naturally, since the results of Chap. 2 can be readily 
extended to that of Chap. 3. 


Prerequisite: Secs. 2.1, 2.2, 2.6, 2.7, 2.10. 
References and Answers to Problems: App. | Part A, and App. 2. 


3.1 Homogeneous Linear ODEs 


Recall from Sec. 1.1 that an ODE is of nth order if the nth derivative y = d"y/dx" of 
the unknown function y(x) is the highest occurring derivative. Thus the ODE is of the form 


Fay oy) =0 


where lower order derivatives and y itself may or may not occur. Such an ODE is called 
linear if it can be written 


(1) y™ + py-icy"? + +++ + pi@dy’ + poWdy = r@). 


(For n = 2 this is (1) in Sec. 2.1 with py = p and po = q.) The coefficients po, -+-, py—1 
and the function r on the right are any given functions of x, and y is unknown. y™ has 
coefficient 1. We call this the standard form. (If you have pine”. divide by py(x) 
to get this form.) An mth-order ODE that cannot be written in the form (1) is called 
nonlinear. 

If r(x) is identically zero, r(x) = 0 (zero for all x considered, usually in some open 
interval J), then (1) becomes 


(2) y + pn-i@y? + +++ + pidy’ + poy = 0 
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and is called homogeneous. If r(x) is not identically zero, then the ODE is called 
nonhomogeneous. This is as in Sec. 2.1. 

A solution of an nth-order (linear or nonlinear) ODE on some open interval J is a 
function y = A(x) that is defined and n times differentiable on J and is such that the ODE 
becomes an identity if we replace the unknown function y and its derivatives by h and its 
corresponding derivatives. 

Sections 3.1—3.2 will be devoted to homogeneous linear ODEs and Section 3.3 to 
nonhomogeneous linear ODEs. 


Homogeneous Linear ODE: Superposition Principle, 
General Solution 


The basic superposition or linearity principle of Sec. 2.1 extends to nth order 
homogeneous linear ODEs as follows. 


Fundamental Theorem for the Homogeneous Linear ODE (2) 


For a homogeneous linear ODE (2), sums and constant multiples of solutions on 
some open interval I are again solutions on I. (This does not hold for a 
nonhomogeneous or nonlinear ODE!) 


The proof is a simple generalization of that in Sec. 2.1 and we leave it to the student. 

Our further discussion parallels and extends that for second-order ODEs in Sec. 2.1. 
So we next define a general solution of (2), which will require an extension of linear 
independence from 2 to n functions. 


General Solution, Basis, Particular Solution 


A general solution of (2) on an open interval / is a solution of (2) on J of the form 


(3) MED) S imils) ar 9° ar Gran Ae, (cy,°**, Cy arbitrary) 


where yj,°°*, Yn is a basis (or fundamental system) of solutions of (2) on J; that 
is, these solutions are linearly independent on J, as defined below. 

A particular solution of (2) on J is obtained if we assign specific values to the 
n constants cy,°**, Cy in (3). 


Linear Independence and Dependence 


Consider n functions y;(x),°+:, y,(x) defined on some interval I. 
These functions are called linearly independent on / if the equation 


(4) kyyi(x) + ++: + knyn(x) = 0 on I 


implies that all ky,---, ky, are zero. These functions are called linearly dependent 
on J if this equation also holds on J for some ky,---, ky, not all zero. 
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If and only if y,,---, y, are linearly dependent on J, we can express (at least) one of 
these functions on / as a “linear combination” of the other n — 1 functions, that is, as a 
sum of those functions, each multiplied by a constant (zero or not). This motivates the 
term “linearly dependent.” For instance, if (4) holds with ky # 0, we can divide by ky 
and express y, as the linear combination 


1 
v= ~ 5, 292 toes + kn Yn)- 


Note that when n = 2, these concepts reduce to those defined in Sec. 2.1. 


Linear Dependence 
Show that the functions yy = x, ye = 5x, y3 = 2x are linearly dependent on any interval. 


Solution. yo = Oy, + 2.5y3. This proves linear dependence on any interval. 3] 


Linear Independence 


Show that yy = x, yg = x’, y3 = x? are linearly independent on any interval, for instance, on —1 S x S 2. 


Solution. Equation (4) is kyx + kox? + k3x°? = 0. Taking (a) x 1, (b) x = I, (c) x = 2, we get 
(a) —k, + kg — k3 = 0, (b) ky + kg + kz = 0, (c) 2ky + 4ky + 8k3 = 0. 


ky = 0 from (a) + (b). Then kg = 0 from (c) —2(b). Then ky = 0 from (b). This proves linear independence. 
A better method for testing linear independence of solutions of ODEs will soon be explained. a) 


General Solution. Basis 
Solve the fourth-order ODE 
yl¥ — 5y” + 4y =0 (where y'¥ = d*y/dx*). 


Solution. As in Sec. 2.2 we substitute y = e*”. Omitting the common factor e*”, we obtain the characteristic 
equation 


M527 +4=0. 


This is a quadratic equation in 4» = A”, namely, 


Bo Sue +4 = (uw 1) — 4) = 0. 


The roots are w = | and 4. Hence A = —2, —1, 1,2. This gives four solutions. A general solution on any 
interval is 


y = cye 2" + coe * + cge™ + ce?” 


provided those four solutions are linearly independent. This is true but will be shown later. is] 


Initial Value Problem. Existence and Uniqueness 


An initial value problem for the ODE (2) consists of (2) and n initial conditions 
(5) y(Xo) = Ko, y' (xo) = Ki, ree) eg) = eae 


with given xg in the open interval J considered, and given Ko,-::, Ky,-1. 
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In extension of the existence and uniqueness theorem in Sec. 2.6 we now have the 
following. 


Existence and Uniqueness Theorem for Initial Value Problems 


If the coefficients po(x), +++, Py—1(x) of (2) are continuous on some open interval I 
and Xq is in I, then the initial value problem (2), (5) has a unique solution y(x) on I. 


Existence is proved in Ref. [All] in App. 1. Uniqueness can be proved by a slight 
generalization of the uniqueness proof at the beginning of App. 4. 


Initial Value Problem for a Third-Order Euler—Cauchy Equation 


Solve the following initial value problem on any open interval J on the positive x-axis containing x = 1. 


x3y" 3x2 y" 4 oxy’ — by = 0, yl) = 2, yd) = 1, y"() = -4, 


Solution. Step 1. General solution. As in Sec. 2.5 we try y = x". By differentiation and substitution, 
Pp y.) ry: 


mim — 1)(m — 2)x"™ — 3m(m — 1)x™ + 6mx™ — 6x”™ = 0. 


Dropping x” and ordering gives m? — 6m? + 11m — 6 = O. If we can guess the root m = 1. We can divide 
by m — 1 and find the other roots 2 and 3, thus obtaining the solutions x, x2, x3, which are linearly independent 
on J (see Example 2). [In general one shall need a root-finding method, such as Newton’s (Sec. 19.2), also 
available in a CAS (Computer Algebra System).] Hence a general solution is 


y= Cpe + Cox" = 3x" 


valid on any interval J, even when it includes x = 0 where the coefficients of the ODE divided by x? (to have 
the standard form) are not continuous. 


Step 2. Particular solution. The derivatives are y’ = cy + 2cox + 3¢3x and y” = 2cg + 6c3x. From this, and 
y and the initial conditions, we get by setting x = | 


(a) YA) = ert cat cg = 2 


(b) yl) =c, + 2cg + 3cg = 1 


(c) y"() = 2c2 + 6c3 = —4. 


This is solved by Cramer’s rule (Sec. 7.6), or by elimination, which is simple, as follows. (b) — (a) gives 
(d) cg + 2c3 = —1. Then (c) — 2(d) gives cz = —1. Then (c) gives cg = 1. Finally cy = 2 from (a). 
Answer: y = 2x + x7 = x3, | 


Linear Independence of Solutions. Wronskian 


Linear independence of solutions is crucial for obtaining general solutions. Although it can 
often be seen by inspection, it would be good to have a criterion for it. Now Theorem 2 
in Sec. 2.6 extends from order n = 2 to any n. This extended criterion uses the Wronskian 
W of n solutions yy,---, y, defined as the nth-order determinant 


y1 y2 yi Yn 
! , | f 
Dal y2 aoe Yn 
(6) W01,°°*> Yn) = 
yD ye) oo yD 
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Note that W depends on x since y,,---, y,, do. The criterion states that these solutions 
form a basis if and only if W is not zero; more precisely: 


Linear Dependence and Independence of Solutions 


Let the ODE (2) have continuous coefficients po(x),*** , Pn—1(x) on an open interval 
I. Then n solutions yy, +++ , Yn of (2) on L are linearly dependent on I if and only if their 
Wronskian is zero for some x = xg in I. Furthermore, if W is zero for x = xo, then W 
is identically zero on I. Hence if there is an x, in at which Wis not zero, then y1,°**, Yn, 
are linearly independent on I, so that they form a basis of solutions of (2) on I. 


(a) Let y1,---, yn, be linearly dependent solutions of (2) on J. Then, by definition, there 
are constants kj,---,k, not all zero, such that for all x in J, 


(7) kyyy tts +tkyyy = 0. 
By n — 1 differentiations of (7) we obtain for all x in J 
kiyit s+ +knyn =0 
(8) 
koi? ae gad kyo) = 0. 


(7), (8) is a homogeneous linear system of algebraic equations with a nontrivial solution 
k4,:++, ky. Hence its coefficient determinant must be zero for every x on J, by Cramer’s 
theorem (Sec. 7.7). But that determinant is the Wronskian W, as we see from (6). Hence 
W is zero for every x on I. 


(b) Conversely, if W is zero at an xq in J, then the system (7), (8) with x = xo has a 
solution ky ae ki not all zero, by the same theorem. With these constants we define 
the solution y* = kty1 5 a kityn of (2) on I. By (7), (8) this solution satisfies the 
initial conditions y*(x9) = 0,---, y*™ YX) = 0. But another solution satisfying the 
same conditions is y = 0. Hence y* = y by Theorem 2, which applies since the coefficients 
of (2) are continuous. Together, y* = kfy, + +++ + kiyn, = 0 on I. This means linear 
dependence of y1,°--, y, on J. 


(c) If W is zero at an x9 in J, we have linear dependence by (b) and then W = 0 by (a). 
Hence if W is not zero at an x, in J, the solutions yj,---, y, must be linearly independent 
on I, a 


Basis, Wronskian 


We can now prove that in Example 3 we do have a basis. In evaluating W, pull out the exponential functions 
columnwise. In the result, subtract Column | from Columns 2, 3, 4 (without changing Column 1). Then expand by 
Row 1. In the resulting third-order determinant, subtract Column 1| from Column 2 and expand the result by Row 2: 


e e e ew 1 1 1 1 


—2e"2® ge e 26 =) =] 1 


aN 
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A General Solution of (2) Includes All Solutions 


Let us first show that general solutions always exist. Indeed, Theorem 3 in Sec. 2.6 extends 
as follows. 


Existence of a General Solution 


If the coefficients po(x),*** , Pn—1(x) of (2) are continuous on some open interval I, 
then (2) has a general solution on I. 


We choose any fixed xg in J. By Theorem 2 the ODE (2) has n solutions yy,---, y,, where 
y; satisfies initial conditions (5) with K;_; = 1 and all other K’s equal to zero. Their 
Wronskian at x9 equals 1. For instance, when n = 3, then y,(xo) = 1, ya(xo) = 1, 
y3 (xq) = 1, and the other initial values are zero. Thus, as claimed, 


yi%o) —-yalxo) ~—-ya(Xo) 1 0 0 
W(y1(%0), Yalxo) Y3(X0)) = |yi%o) yalxo) ya(xo)|=]0 1 Of =1. 
yi(%o) y2(%o) y3o)| |0 0 1 
Hence for any n those solutions yj,---, yy, are linearly independent on /, by Theorem 3. 
They form a basis on J, and y = cyy, + ++: + cyyn is a general solution of (2)on/ 


We can now prove the basic property that, from a general solution of (2), every solution 
of (2) can be obtained by choosing suitable values of the arbitrary constants. Hence an 
nth-order linear ODE has no singular solutions, that is, solutions that cannot be obtained 
from a general solution. 


General Solution Includes All Solutions 


If the ODE (2) has continuous coefficients po(x),*** , Py—1(X) on some open interval 
I, then every solution y = Y(x) of (2) on I is of the form 


(9) Y(x) = Cry1(x) + +++ + Cryn@) 


where y1,°**, Yn isa basis of solutions of (2) on IT and Cy, +++, Cy are suitable constants. 


Let Y be a given solution and y = cyyy + ++: + Cyypn a general solution of (2) on J. We 
choose any fixed x9 in J and show that we can find constants c1,---,¢C, for which y and 
its first n — | derivatives agree with Y and its corresponding derivatives at x9. That is, 
we should have at x = xo 


cyyy tots + Cpyn = Y 


ciyp tes + CnYn = Y' 


(10) 
yy P $e t egy OD = yOrD. 
But this is a linear system of equations in the unknowns cy,-:--,Cy. Its coefficient 


determinant is the Wronskian W of yy,---, yy, at xo. Since yy,---, y, form a basis, they 
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are linearly independent, so that W is not zero by Theorem 3. Hence (10) has a unique 
solution cy = Cy,°*+, Cy = Cy (by Cramer’s theorem in Sec. 7.7). With these values we 
obtain the particular solution 

yx) = Cyyi@) + +++ + Cr¥n@) 
on J. Equation (10) shows that y* and its first n — 1 derivatives agree at xg with Y and 
its corresponding derivatives. That is, y* and Y satisfy, at x9, the same initial conditions. 


The uniqueness theorem (Theorem 2) now implies that y* = Y on J. This proves the 
theorem. a 


This completes our theory of the homogeneous linear ODE (2). Note that for n = 2 it is 


identical with that in Sec. 2.6. This had to be expected. 


PROBLEM SET 3-1 


1-6| BASES: TYPICAL EXAMPLES 


To get a feel for higher order ODEs, show that the given 
functions are solutions and form a basis on any interval. 
Use Wronskians. In Prob. 6, x > 0, 


1; 


IAW RwWH 


1, x, x2, x°, y* =0 


7 e*, ee, eo, y” 2y" y’ 2y -0 
. COS X, SiN X, X COS xX, x sin x, yiv + Qy" +y=0 
et xe x70, yl + 12y"+ 48y'+ 64y = 0 


1, e~* cos 2x, e7* sin 2x, y’” + 2y" + 5y’ =0 


1, x2, x4, xy" _ 3xy" + By’ =0 


. TEAM PROJECT. General Properties of Solutions 


of Linear ODEs. These properties are important in 
obtaining new solutions from given ones. Therefore 
extend Team Project 38 in Sec. 2.2 to nth-order ODEs. 
Explore statements on sums and multiples of solutions 
of (1) and (2) systematically and with proofs. 
Recognize clearly that no new ideas are needed in this 
extension from n = 2 to general n. 


LINEAR INDEPENDENCE 


Are the given functions linearly independent or dependent 
on the half-axis x = 0? Give reason. 


8. 


x, (Tne 0 9. tan x, cot x, 1 


. sin? xX, cos” xX, COS 2x 
14, 
16. 


11. e” cos x, e” sin x, e” 
13. sin x, cos x, sin 2x 

2. 2 : 2 
cos” x, sin” x, 277 15. cosh 2x, sinh 2x, e“” 


TEAM PROJECT. Linear Independence and 
Dependence. (a) Investigate the given question about 
a set S of functions on an interval J. Give an example. 
Prove your answer. 

(1) If S contains the zero function, can S be linearly 
independent? 

(2) If S is linearly independent on a subinterval J of J, 
is it linearly independent on /? 

(3) If S is linearly dependent on a subinterval J of J, 
is it linearly dependent on /? 

(4) If S is linearly independent on J, is it linearly 
independent on a subinterval J? 

(5) If S is linearly dependent on /, is it linearly 
independent on a subinterval J? 

(6) If S is linearly dependent on J, and if T contains S, 
is T linearly dependent on J? 

(b) In what cases can you use the Wronskian for 
testing linear independence? By what other means can 
you perform such a test? 


3.2 Homogeneous Linear ODEs 
with Constant Coefficients 


We proceed along the lines of Sec. 2.2, and generalize the results from n = 2 to arbitrary n. 
We want to solve an nth-order homogeneous linear ODE with constant coefficients, 
written as 


(1) ye + an-1y ? + +++ + ay’ + apy = 0 
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where y = d"y/dx", etc. As in Sec. 2.2, we substitute y = e*” to obtain the characteristic 
equation 


(2) NP + ay—AT~P + Haat doy = 0 


of (1). If A is a root of (2), then y = e* is a solution of (1). To find these roots, you may 
need a numeric method, such as Newton’s in Sec. 19.2, also available on the usual CASs. 
For general n there are more cases than for n = 2. We can have distinct real roots, simple 
complex roots, multiple roots, and multiple complex roots, respectively. This will be shown 
next and illustrated by examples. 


Distinct Real Roots 


If all the n roots Ay,---, A, of (2) are real and different, then the 1 solutions 
(3) Se 


constitute a basis for all x. The corresponding general solution of (1) is 
(4) y = ce + ++ + ce”. 


Indeed, the solutions in (3) are linearly independent, as we shall see after the example. 


Distinct Real Roots 
Solve the ODE y” — 2y" — y’ + 2y =0. 


Solution. The characteristic equation is A? — 2a? — A + 2 = 0. It has the roots —1, 1, 2; if you find one 
of them by inspection, you can obtain the other two roots by solving a quadratic equation (explain!). The 
corresponding general solution (4) is y = cye~* + cge” + cge. 


Linear Independence of (3). Students familiar with nth-order determinants may verify 
that, by pulling out all exponential functions from the columns and denoting their product 
by E = exp [Ay +--+ + A,)x], the Wronskian of the solutions in (3) becomes 


et ert ern 
Aye™ Ager?” tee Aner” 
22 2A: 2A 
W=| Aje*” Age?” nar Aner” 
Ate Age ee |e cas 
(5) 
1 1 1 
MN de Xn 
=E| Az de pd 
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The exponential function E is never zero. Hence W = 0 if and only if the determinant on 
the right is zero. This is a so-called Vandermonde or Cauchy determinant.’ It can be 
shown that it equals 


(6) (-1)"" by 
where V is the product of all factors A; — Aj;, with j < k (=n); for instance, when n = 3 


we get —V = —(Aq — Aa)(Ay — Ag)(Ag — Ag). This shows that the Wronskian is not zero 
if and only if all the 1 roots of (2) are different and thus gives the following. 


Basis 


Solutions yy = e™”,-++, yy, = e*”” of (1) (with any real or complex A;’s) form a 


basis of solutions of (1) on any open interval if and only if all n roots of (2) are 
different. 


nt 


Actually, Theorem | is an important special case of our more general result obtained 
from (5) and (6): 


Linear Independence 


Any number of solutions of (1) of the form e*” are linearly independent on an open 
interval I if and only if the corresponding X are all different. 


Simple Complex Roots 


If complex roots occur, they must occur in conjugate pairs since the coefficients of (1) 
are real. Thus, if A = y + iw is a simple root of (2), so is the conjugate A = y — iw, and 
two corresponding linearly independent solutions are (as in Sec. 2.2, except for notation) 


yy = e”” cos wx, yo = e sin wx. 


Simple Complex Roots. Initial Value Problem 


Solve the initial value problem 


y” —y" + 100y' - 100y=0, yO)=4, yO)=11, yy") = —299. 


Solution. The characteristic equation is A? — r+ 100A — 100 = 0. It has the root 1, as can perhaps be 
seen by inspection. Then division by A — | shows that the other roots are +107. Hence a general solution and 
its derivatives (obtained by differentiation) are 


y = cye” + Acos 10x + B sin 10x, 
y’ = cye” — 10A sin 10x + 10B cos 10x, 


y” = cye” — 100A cos 10x — 100B sin 10x. 


‘ALEXANDRE THEOPHILE VANDERMONDE (1735-1796), French mathematician, who worked on 
solution of equations by determinants. For CAUCHY see footnote 4, in Sec. 2.5. 
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From this and the initial conditions we obtain, by setting x = 0, 
(a) cy +A =4, (b) cy + 10B = 11, (c) cy — 100A = —299. 


We solve this system for the unknowns A, B, c,. Equation (a) minus Equation (c) gives 101A = 303, A = 3. 
Then c, = | from (a) and B = 1 from (b). The solution is (Fig. 73) 


y = e* + 3cos 10x + sin 10x. 


This gives the solution curve, which oscillates about e” (dashed in Fig. 73). B 


Fig. 73. Solution in Example 2 


Multiple Real Roots 


If a real double root occurs, say, Ay = Ag, then yy = yo in (3), and we take y, and xy, as 
corresponding linearly independent solutions. This is as in Sec. 2.2. 

More generally, if A is a real root of order m, then m corresponding linearly independent 
solutions are 


(7) AX AX ZN ee Me Ae 


We derive these solutions after the next example and indicate how to prove their linear 
independence. 


Real Double and Triple Roots 


Solve the ODE y” — 3y'” + 3y” — y” =0. 


Solution. The characteristic equation d® — 3A + 3a? — A? = Ohas the roots Ay = Ag = 0, and Ag = Aq 
As = 1, and the answer is 


(8) y = cy + cgx + (cg + cax + €5x7)e*. B 


Derivation of (7). We write the left side of (1) as 


(m—-1) Ae wid agy. 


Lly] = y? + an-1y 
Let y = e*”. Then by performing the differentiations we have 


Lle*™] = (A" + any 1 + +++ + age”. 
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Now let A, be a root of mth order of the polynomial on the right, where m S n. Form <n 
let Ay,+1,°°*, An be the other roots, all different from A,. Writing the polynomial in 
product form, we then have 

L{e*”] = (A — Ay) ™h(A)e*” 
with h(A) = 1 if m =n, and A(A) = (A — Apay)***(A — Ay) if m <n. Now comes the 
key idea: We differentiate on both sides with respect to A, 


) = x m 9 x 
(9) ue) = m(A — Ay) Thre + (A — Ar) aA [A(A)e*']. 


The differentiations with respect to x and A are independent and the resulting derivatives 
are continuous, so that we can interchange their order on the left: 


(10) 


L 
Or Or 


fe"|;= | el = tx"), 

The right side of (9) is zero for A = A, because of the factors A — Ay (and m = 2 since 
we have a multiple root!). Hence ire") =0 by (9) and (10). This proves that xe” is 
a solution of (1). 

We can repeat this step and produce x -,x™—1eh” by another m — 2 such 
differentiations with respect to A. Going one step further would no longer give zero on the 
right because the lowest power of A — A, would then be (A — a’ multiplied by m!h(A) 
and h(A;) # 0 because (A) has no factors A — Az; so we get precisely the solutions in (7). 

We finally show that the solutions (7) are linearly independent. For a specific n this 
can be seen by calculating their Wronskian, which turns out to be nonzero. For arbitrary 
m we can pull out the exponential functions from the Wronskian. This gives (e*”)’” = e?”"” 
times a determinant which by “row operations” can be reduced to the Wronskian of 1, 
X,0t*, x™-1. The latter is constant and different from zero (equal to 1!2!---(m — 1)!). 
These functions are solutions of the ODE Sia = 0, so that linear independence follows 
from Theroem 3 in Sec. 3.1. 


2A 
e we. 


Multiple Complex Roots 


In this case, real solutions are obtained as for complex simple roots above. Consequently, 
if A = y + iw is a complex double root, so is the conjugate A = y — iw. Corresponding 
linearly independent solutions are 


(11) e”” cos wx, e’” sinwx, xe’ cos ax, xe” sin wx. 


The first two of these result from e*” and e*” as before, and the second two from xe*” 
and xe*” in the same fashion. Obviously, the corresponding general solution is 


(12) y = e”[(Ay + Aox) cos wx + (By + Box) sin wx]. 


For complex triple roots (which hardly ever occur in applications), one would obtain 
two more solutions x2e”" cos wx, x2e” sin wx, and so on. 
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PROBLEM SET 3.2 


1-6| GENERAL SOLUTION 
Solve the given ODE. Show the details of your work. 
1. yy” + 25y’ =0 
y+ 2y"+y=0 
y’ + 4y" =0 
(D? — D?-D+1)y=0 
(D* + 10D? + 9Dy =0 
(D® + 8D? + 16D)y = 0 


Aw Rw 


7-13| INITIAL VALUE PROBLEM 
Solve the IVP by a CAS, giving a general solution and the 
particular solution and its graph. 
Ty" +3.2y"+48ly’=0, y(0)=3.4, y'(0) = —-4.6, 
y"(0) = 9.91 
8. y"” + 7.5y" + 14.25y’ — 9.125y = 0, (0) = 10.05, 
y'(0) = —54.975,  y"(0) = 257.5125 
9, 4y"” + 8y” + 41y’ + 37y =0, y(0) = 9, 
y'(0) = -6.5, y"(0) = —39.75 
10. y'¥ + 4y=0, y(0)=3, yO) = 3, 
m rae 7 
y (0) = -3 
11. y'’ — 9y” — 400y = 0, y(0)=0, yO) =0, 
y"(0) = 41, y"() =0 
12. y¥ — Sy” + 4y' =0, yO) =3, y'(0) = -5, 
y"(0) = 11, y'"(0) = —23,  y'%(0) = 47 


LB. y+ 0.45y"" — 0.165y” + 0.0045y’ — 0.00175y = 0, 
y(0) = 17.4, yO) = —2.82, y"(O) = 2.0485, 
y” (0) = —1.458675 


14. PROJECT. Reduction of Order. This is of practical 


interest since a single solution of an ODE can often be 
guessed. For second order, see Example 7 in Sec. 2.1. 


(a) How could you reduce the order of a linear 
constant-coefficient ODE if a solution is known? 
(b) Extend the method to a variable-coefficient ODE 


m 


yl" + polx)y” + pidy’ + po(xy = 0. 


Assuming a solution y; to be known, show that another 
solution is yo(x) = u(x)yy(x) with ux) = f(x) dx and 
z obtained by solving 


yiz” + By1 + payiz’ + Byi + 2peyi + pryvz = 0. 


(c) Reduce 


x8y"" 3x2y" + (6 x? )xy’ (6 x)y = 0, 


using y; = x (perhaps obtainable by inspection). 


15. CAS EXPERIMENT. Reduction of Order. Starting 
with a basis, find third-order linear ODEs with variable 
coefficients for which the reduction to second order 
turns out to be relatively simple. 


3.3. Nonhomogeneous Linear ODEs 


We now turn from homogeneous to nonhomogeneous linear ODEs of nth order. We write 


them in standard form 


(1) eee ee 


Mm) _ 


+ pr(x)y’ + po(xdy = r(x) 


with y"” = d"y/dx” as the first term, and r(x) # 0. As for second-order ODEs, a general 
solution of (1) on an open interval J of the x-axis is of the form 


(2) V(X) = Yr) + Yp(). 


Here y,(x) = cyyy(X) + +++ + Cyyn(x) is a general solution of the corresponding 


homogeneous ODE 


(3) ¥™ + Py—ry™"Y + +++ + pry’ + poy = 0 

on J. Also, yp is any solution of (1) on J containing no arbitrary constants. If (1) has 
continuous coefficients and a continuous r(x) on J, then a general solution of (1) exists 
and includes all solutions. Thus (1) has no singular solutions. 


SEC. 3.3 Nonhomogeneous Linear ODEs 117 


EXAMPLE 1 


An initial value problem for (1) consists of (1) and n initial conditions 
(4) yo) = Ko, y'@o)=Ki, 2,  ¥"M(X0) = Kn-1 


with xg in J. Under those continuity assumptions it has a unique solution. The ideas of 
proof are the same as those for n = 2 in Sec. 2.7. 


Method of Undetermined Coefficients 


Equation (2) shows that for solving (1) we have to determine a particular solution of (1). 
For a constant-coefficient equation 


(5) ae) Ga eee coy) 


(dg, *** , @,—1 Constant) and special r(x) as in Sec. 2.7, such a y,(x) can be determined by 
the method of undetermined coefficients, as in Sec. 2.7, using the following rules. 


(A) Basic Rule as in Sec. 2.7. 


(B) Modification Rule. /f a term in your choice for yp(x) is a solution of the 
homogeneous equation (3), then multiply this term by x, where k is the smallest 
positive integer such that this term times x* is not a solution of (3). 


(C) Sum Rule as in Sec. 2.7. 


The practical application of the method is the same as that in Sec. 2.7. It suffices to 
illustrate the typical steps of solving an initial value problem and, in particular, the new 
Modification Rule, which includes the old Modification Rule as a particular case (with 
k = | or 2). We shall see that the technicalities are the same as for n = 2, except perhaps 
for the more involved determination of the constants. 


Initial Value Problem. Modification Rule 


Solve the initial value problem 


(6) yn" + 3y" + 3y’ + y = 30e”, y(0) = 3, y’(0) = 3, y"(0) = —47. 


Solution. Step 1. The characteristic equation is A? + 3A? + 3A + 1 = (A + 1)? = 0. It has the triple root 
A = —1. Hence a general solution of the homogeneous ODE is 


yn = ce” + cgxe* + cgx2e7* 


= (cy + cox + cgx2)e~*, 


Step 2. If we try yp = Ce~*, we get —C + 3C — 3C + C = 30, which has no solution. Try Cxe~* and 
Cx?e~*. The Modification Rule calls for 


yp Cxte*, 
Then Yp = Cx? — xe", 


Vp = C(6x — 6x” + x3)e~*, 


yp = C(6 — 18x + 9x? — x3)e~*, 
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Substitution of these expressions into (6) and omission of the common factor e~” gives 


C(6 — 18x + 9x? — x3) + 3C(6x — 6x? + x3) + 3C(3x2 — x3) + Cx? = 30. 
The linear, quadratic, and cubic terms drop out, and 6C = 30. Hence C = 5. This gives yp = 5x3e7*. 
Step 3. We now write down y = yp + yp, the general solution of the given ODE. From it we find c; by the 
first initial condition. We insert the value, differentiate, and determine cy from the second initial condition, insert 
the value, and finally determine cg from y” (0) and the third initial condition: 


1 n 1 x 


Y =Yn + Yp = (C1 + Cox 4 c3x2)e* + 5x8e7*, y(0) = cy = 3 


y’ = [-3 + cg + (—cg + 2c3)x + (15 C3)x2 5x3 Je—*, y'(0) 34+ C2 3, c= 0 


y” = [3B + 2cg + (30 — 4c3)x + (—30 4 c3)x t 5x3Je"*, y"(0) = 3 + 2cg = —47, cg = —25. 
Hence the answer to our problem is (Fig. 73) 
y = (3 — 25x7)e~* + 5x30. 


The curve of y begins at (0, 3) with a negative slope, as expected from the initial values, and approaches zero 
as x —> %, The dashed curve in Fig. 74 is yp. 


Fig. 74. y and y, (dashed) in Example 1 


Method of Variation of Parameters 


The method of variation of parameters (see Sec. 2.10) also extends to arbitrary order n. 
It gives a particular solution y, for the nonhomogeneous equation (1) (in standard form 


with y™ as the first term!) by the formula 
1e(X) 
yp) = > YE) [32 Woy Oe 
(7) 
Wy (x) Wr) 
= yx09 | Wo (AGO) Gs ap PPS sp rato | Wo r(x) dx 


on an open interval J on which the coefficients of (1) and r(x) are continuous. In (7) the 
functions yj,-:-, y, form a basis of the homogeneous ODE (3), with Wronskian W, and 
W;(j = 1,---,m) is obtained from W by replacing the jth column of W by the column 


(Oo O -:: O a". Thus, when n = 2, this becomes identical with (2) in Sec. 2.10, 
yi ye 0 ye yr (O 
W=], Is m= | ,| = —Y2, We f =. 
Yi 2 1 yo yy i 


SEC. 3.3 Nonhomogeneous Linear ODEs 119 


EXAMPLE 2 


The proof of (7) uses an extension of the idea of the proof of (2) in Sec. 2.10 and can 
be found in Ref [A11] listed in App. 1. 


Variation of Parameters. Nonhomogeneous Euler—Cauchy Equation 


Solve the nonhomogeneous Euler—Cauchy equation 


x8y" 3x2y" + 6xy’ — 6y x? Inx (x > 0). 
Solution. Step 1. General solution of the homogeneous ODE. Substitution of y = x” and the derivatives 
into the homogeneous ODE and deletion of the factor x”” gives 


1) + 6m — 6 = 0. 


mim — 1)(m — 2) — 3m(m 


The roots are 1, 2, 3 and give as a basis 
y3 = coe 


yi = x, ya: a 


Hence the corresponding general solution of the homogeneous ODE is 
3 


VY, = yx + Cox? + c3x°. 


Step 2. Determinants needed in (7). These are 


W,=|0 2x 3x7| = x4 
L -2 6x 
x 0 x? 
W=/1 0 3x7}= —2x3 
0 1 6x 
x x* 0 
W3=]1 2x O|=x. 
0 2 1 


Step 3. Integration. In (7) we also need the right side r(x) of our ODE in standard form, obtained by division 
of the given equation by the coefficient x? of ys thus, r(x) = (x4 In x)/x? = x In x. In (7) we have the simple 
quotients W,/W = x/2, Wo/W = —1, W3/W = 1/(2x). Hence (7) becomes 


1 
= [Eeineax— 2? [rmxdr tx? [poxmxac 
2 2x. 


Yp 
a (5 Inx =) x? (5 Inx =) t x (x Inx — x). 
24:3 9 2 4 2 
Simplification gives y, = 3x4 (In x — id), Hence the answer is 
Y=Yn + Vp = cx 4 Cox2 + c3x? 4 dx? (in x — i), 


Figure 75 shows y,. Can you explain the shape of this curve? Its behavior near x = 0? The occurrence of a minimum? 
Its rapid increase? Why would the method of undetermined coefficients not have given the solution? ia] 
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oe 


10° x 


—-20- 


Fig. 75. Particular solution y, of the nonhomogeneous 
Euler—Cauchy equation in Example 2 


Application: Elastic Beams 


Whereas second-order ODEs have various applications, of which we have discussed some 
of the more important ones, higher order ODEs have much fewer engineering applications. 
An important fourth-order ODE governs the bending of elastic beams, such as wooden or 
iron girders in a building or a bridge. 

A related application of vibration of beams does not fit in here since it leads to PDEs 
and will therefore be discussed in Sec. 12.3. 


Bending of an Elastic Beam under a Load 


We consider a beam B of length L and constant (e.g., rectangular) cross section and homogeneous elastic 
material (e.g., steel); see Fig. 76. We assume that under its own weight the beam is bent so little that it is 
practically straight. If we apply a load to B in a vertical plane through the axis of symmetry (the x-axis in 
Fig. 76), B is bent. Its axis is curved into the so-called elastic curve C (or deflection curve). It is shown in 
elasticity theory that the bending moment M(x) is proportional to the curvature k(x) of C. We assume the bending 
to be small, so that the deflection y(x) and its derivative y’ (x) (determining the tangent direction of C) are small. 
Then, by calculus, k = y"/(1 + y'2)3/? = y”. Hence 


M(x) = Ely" (x). 
EI is the constant of proportionality. E is Young’s modulus of elasticity of the material of the beam. J is the 
moment of inertia of the cross section about the (horizontal) z-axis in Fig. 76. 


Elasticity theory shows further that M” (x) = f(x), where f(x) is the load per unit length. Together, 


(8) Ely’ = f(x). 


Deformed beam 
under uniform load 
(simply supported) 


Fig. 76. Elastic beam 
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In applications the most important supports and corresponding boundary conditions are as follows and shown 
in Fig. 77. 


(A) Simply supported y=y" =Oatx =OandL 
(B) Clamped at both ends y=y' =Oatx=OandL 


(C) Clamped at x = 0, freeatx =L = (0) = y'(0) = 0, y"(D = y"(L) = 0. 


The boundary condition y = 0 means no displacement at that point, y’ = 0 means a horizontal tangent, y” = 0 


means no bending moment, and y” = 0 means no shear force. 
Let us apply this to the uniformly loaded simply supported beam in Fig. 76. The load is f(x) = fo = const. 
Then (8) is 
fo 
9 p= % k= —. 
(9) y EI 


This can be solved simply by calculus. Two integrations give 


k 
y" = rel + Cx + C9. 


y"(0) = 0 gives cg = 0. Then y”(L) L(&KL + Cy) = 0, cy kL/2 (since L # 0). Hence 


Integrating this twice, we obtain 


with cq = 0 from y(0) = 0. Then 


O (Ee ae ) 3 iT 
y on 6° Sy 


Inserting the expression for k, we obtain as our solution 


Jo 
24EI 


(x* — 2Lx? + 13x). 


y 


Since the boundary conditions at both ends are the same, we expect the deflection y(x) to be “symmetric” with 
respect to L/2, that is, y(x) = y(L — x). Verify this directly or set x = u + L/2 and show that y becomes an 


even function of u, 
fo (w 1 ”) ( ai 
L Le) 
my ae” tae | 


From this we can see that the maximum deflection in the middle at wu = 0(x = L/2) is Sfol4/ (16 - 24EFI). Recall 
that the positive direction points downward. 


ae 
(A) Simply supported 
x=0 x=L 
———— (B) Clamped at both 
ends 
x=0 See 
7 (C) Clamped at the left 
end, free at the 
x=0 x=L right end 


Fig. 77. Supports of a beam 
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PROBLEM SET 3.3 


1-7 


GENERAL SOLUTION 


Solve the following ODEs, showing the details of your 
work. 


1. y 


6. 


7 


woe wh 


m 


+ 3y" + 3y’ ty =e™-x-1 
2y" — y' — 2y=1- 4x3 

(D* + 10D? + 9/)y = 6.5 sinh 2x 

3D? — 5D — 391)y = —300 cos x 
. (x8p3 + x2D? — 2xD + 2Dy = x7? 
(D? + 4D)y = sin x 
. (D? — 9D? + 27D 


271)y = 27 sin 3x 


8-13 


INITIAL VALUE PROBLEM 


Solve the given IVP, showing the details of your work. 


8 


9 


10 


11 


12. 


. yl¥ — 5y” + 4y = 10e73”, (0) = 1, y'(0) = 0, 
y"(0) = 0, y"(0) = 0 

. yi’ + 5y” + 4y = 90sin 4x, y(0) = 1, y’() = 2, 
y"(0) = -1, y"" (0) = —32 


xy" + xy’ —y =x, yl) = 1, yA) = 3, 
y"(Q) = 14 

. (D? — 2D? — 3D)yy = 74e7®" sinx, (0) = —1.4, 
y'(0) = 3.2, y"(0) = —5.2 

. (D? — 2D? — 9D + 18Dy = e", (0) = 4.5, 
y'(0) = 8.8, y"(0) = 17.2 


13. 


14. 


15. 


(D? — 4D)y = 10cosx + Ssinx, y(0) = 3, 
y'(0) = —2, y"(0) = -1 


CAS EXPERIMENT. Undetermined Coefficients. 
Since variation of parameters is generally complicated, 
it seems worthwhile to try to extend the other method. 
Find out experimentally for what ODEs this is possible 
and for what not. Hint: Work backward, solving ODEs 
with a CAS and then looking whether the solution 
could be obtained by undetermined coefficients. For 
example, consider 


m 3y" 3y’ y= x /2 Qt 
and 
x3y"" + x?y" — Ixy’ + 2y = xP Ink. 
WRITING REPORT. Comparison of Methods. Write 


a report on the method of undetermined coefficients and 
the method of variation of parameters, discussing and 
comparing the advantages and disadvantages of each 
method. Illustrate your findings with typical examples. 
Try to show that the method of undetermined coefficients, 
say, for a third-order ODE with constant coefficients and 
an exponential function on the right, can be derived from 
the method of variation of parameters. 


CHAP TER-3 REVIEW QUESTIONS AND PROBLEMS 


1 


2. 


5 


. What is the superposition or linearity principle? For 

what nth-order ODEs does it hold? 

List some other basic theorems that extend from 

second-order to nth-order ODEs. 

. If you know a general solution of a homogeneous linear 
ODE, what do you need to obtain from it a general 
solution of a corresponding nonhomogeneous linear 
ODE? 

. What form does an initial value problem for an nth- 
order linear ODE have? 

. What is the Wronskian? What is it used for? 


GENERAL SOLUTION 


So 


Ive the given ODE. Show the details of your work. 


6. yl’ — 3y” — 4y =0 


7 yn" 
8. 


+ 4y” + 13y’ =0 
i Ay" y’ | dy = 3002" 


9. (D* — 16I)y = —15 cosh x 
10. xy!" ae 3xy" _ 2y’ =0 


11. 
12. 
13. 
14. 
15. 


yl” + 4.5y" + 6.75y’ + 3.375y = 0 
(D® — D)y = sinh 0.8x 

(D® + 6D? + 12D + 81)y = 8x? 
(D* — 13D? + 36/)y = 12e” 
4xy” + 3xy' — 3y = 10 


16-20} INITIAL VALUE PROBLEM 


Solve the IVP. Show the details of your work. 


16. 


17. 


18. 


19. 


20. 


(D? — D? —D + Dy =0, y(0) = 0, Dy(0) = 1, 
D?y(0) = 0 
yl” + 5y” + 24y’ + 20y =x, y(0) = 1.94, 


y'(0) = —3.95, y” = —24 

(D* — 26D? + 251Dy = 50(x + 1)?, y(0) = 12.16, 
Dy(0) = —6, D?y(0) = 34, D?y(0) = —130 

(D? + 9D? + 23D + 15I)y = 12exp(—4x), 

y(0) = 9, Dy(0) = —41, Dy(0) = 189 

(D? + 3D? + 3D + Dy =8sinx, yO) = —1, 
y'(0) = —3, y"(0) =5 
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SUMMARY—OF CHAPTER 3 


Higher Order Linear ODEs 


Compare with the similar Summary of Chap. 2 (the case n = 2). 
Chapter 3 extends Chap. 2 from order n = 2 to arbitrary order n. An nth-order 
linear ODE is an ODE that can be written 


()) y+ Pye"? + +++ + prCdy” + poy = r@) 
with y = d"y/dx” as the first term; we again call this the standard form. Equation 


(1) is called homogeneous if r(x) = 0 on a given open interval J considered, 
nonhomogeneous if r(x) # 0 on J. For the homogeneous ODE 


(2) y™ + py-rcdy"? + +++ + pi@dy" + poy = 0 


the superposition principle (Sec. 3.1) holds, just as in the case n = 2. A basis or 
fundamental system of solutions of (2) on / consists of n linearly independent 


solutions yj, °*-, ¥;, of (2) on /. A general solution of (2) on /is a linear combination 
of these, 
(3) y=cyyy t+ ++ + enyn (Cy,°**, Cy arbitrary constants). 


A general solution of the nonhomogeneous ODE (1) on / is of the form 


(4) y¥=Yn + Yp (Sec. 3.3). 


Here, y, is a particular solution of (1) and is obtained by two methods (undetermined 
coefficients or variation of parameters) explained in Sec. 3.3. 

An initial value problem for (1) or (2) consists of one of these ODEs and n 
initial conditions (Secs. 3.1, 3.3) 


(5) — -y@o) = Ko, Qo = Ki, ts VM) = Kn-1 
with given x9 in / and given Ko,---, Ky—1. If po,-**, Py—1, 7 are continuous on J, 


then general solutions of (1) and (2) on J exist, and initial value problems (1), (5) 
or (2), (5) have a unique solution. 


CHAPTER 4 


Systems of ODEs. Phase Plane. 
Qualitative Methods 


Tying in with Chap. 3, we present another method of solving higher order ODEs in 
Sec. 4.1. This converts any nth-order ODE into a system of n first-order ODEs. We also 
show some applications. Moreover, in the same section we solve systems of first-order 
ODEs that occur directly in applications, that is, not derived from an nth-order ODE but 
dictated by the application such as two tanks in mixing problems and two circuits in 
electrical networks. (The elementary aspects of vectors and matrices needed in this chapter 
are reviewed in Sec. 4.0 and are probably familiar to most students.) 

In Sec. 4.3 we introduce a totally different way of looking at systems of ODEs. The 
method consists of examining the general behavior of whole families of solutions of ODEs 
in the phase plane, and aptly is called the phase plane method. It gives information on the 
stability of solutions. (Stability of a physical system is desirable and means roughly that a 
small change at some instant causes only a small change in the behavior of the system at 
later times.) This approach to systems of ODEs is a qualitative method because it depends 
only on the nature of the ODEs and does not require the actual solutions. This can be very 
useful because it is often difficult or even impossible to solve systems of ODEs. In contrast, 
the approach of actually solving a system is known as a quantitative method. 

The phase plane method has many applications in control theory, circuit theory, 
population dynamics and so on. Its use in linear systems is discussed in Secs. 4.3, 4.4, 
and 4.6 and its even more important use in nonlinear systems is discussed in Sec. 4.5 with 
applications to the pendulum equation and the Lokta—Volterra population model. The 
chapter closes with a discussion of nonhomogeneous linear systems of ODEs. 


NOTATION. We continue to denote unknown functions by y; thus, y,(‘), yo()}— 
analogous to Chaps. 1-3. (Note that some authors use x for functions, x1(4), x2(t) when 
dealing with systems of ODEs.) 


Prerequisite: Chap. 2. 
References and Answers to Problems: App. | Part A, and App. 2. 


4.0 For Reference: 
Basics of Matrices and Vectors 


For clarity and simplicity of notation, we use matrices and vectors in our discussion 
of linear systems of ODEs. We need only a few elementary facts (and not the bulk of 
the material of Chaps. 7 and 8). Most students will very likely be already familiar 
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with these facts. Thus this section is for reference only. Begin with Sec. 4.1 and consult 
4.0 as needed. 
Most of our linear systems will consist of two linear ODEs in two unknown functions 


yi), ye(d), 


Yt = 41191 + A122, yt = —5y1 + 2yo 
(1) for example, 


2 = diy + depye, yo = 13y1 + aye 


(perhaps with additional given functions g1(f), g(t) on the right in the two ODEs). 
Similarly, a linear system of n first-order ODEs in n unknown functions yy(t),° ++, yn(4) 
is of the form 


Yi = aiy1 + aiaye + +++ + ainyn 

Ya = doy + degyg + +++ + danYn 
(2) 

Yn = Gniy1 + Gn2y2 + + AnnYn 


(perhaps with an additional given function on the right in each ODE). 


Some Definitions and Terms 


Matrices. In (1) the (constant or variable) coefficients form a 2 X 2 matrix A, that is, 
an array 


a1 412 =5: 2 
(3) A = [ax] = ‘ for example, A= 
a2, dao 13 


Nie 


Similarly, the coefficients in (2) form ann X n matrix 


a1 “42 °° ain 

a2, dog *"" dan 
(4) A = [aj] = 

ani an2 aan ann 


The ay4, d49, ° ++ are called entries, the horizontal lines rows, and the vertical lines columns. 
Thus, in (3) the first row is [a,3 — @y2], the second row is [dz1 ~— dg], and the first and 
second columns are 


a1 a2 
and 
a21 a22, 


In the “double subscript notation” for entries, the first subscript denotes the row and the 
second the column in which the entry stands. Similarly in (4). The main diagonal is the 
diagonal a31 deg °*** dyn in (4), hence ay, dg in (3). 
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We shall need only square matrices, that is, matrices with the same number of rows 
and columns, as in (3) and (4). 


Vectors. A column vector x with n components x1,---,x,, is of the form 
X41 
a) X71 
x=| |, thus if n = 2, x= . 
: x2 
Xn 


Similarly, a row vector v is of the form 


v=[U1 <:: UVyl, thus if n = 2, then V=[v1 Dol. 


Calculations with Matrices and Vectors 


Equality. Twon X n matrices are equal if and only if corresponding entries are equal. 
Thus for n = 2, let 


a1 42 byt Dy 
A= and B= : 
agi deg bo, beg 
Then A = B if and only if 
a1 = by, a2 = by 
dg1 = boy, daz = bop. 


Two column vectors (or two row vectors) are equal if and only if they both have n 
components and corresponding components are equal. Thus, let 


U1 X41 Ui ~ X1 
v= and x= F Then v =x _ if and only if 
v2 x2 


Addition is performed by adding corresponding entries (or components); here, matrices 
must both be n X n, and vectors must both have the same number of components. Thus 
for n = 2, 


ayy + by a2 + bye Vy t+ xy 


(5) A+B= cE vt+x= 


ag + bey dg2 + bes V2 + X2 


Scalar multiplication (multiplication by a number c) is performed by multiplying each 
entry (or component) by c. For example, if 


9 3 


| then -7A = 


—~63 * 


2 § 14 o| 
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0.4 4 
F then 10v = . 
—13 — 130 


Matrix Multiplication. The product C = AB (in this order) of two n X n matrices 
A = [aj] and B = [bj,] is the n X n matrix C = [cj] with entries 


If 


v= 


(6) Cjik = >> Ojmbrk 
m=1 


that is, multiply each entry in the jth row of A by the corresponding entry in the kth column 
of B and then add these n products. One says briefly that this is a “multiplication of rows 
into columns.” For example, 


9 3]f1 -4] [| 9-14+3-2 9+(—4)+3+5 
—2 oj/2 5| [-2-14+0-2 (-2)-(-4+0°5| 
[15 -21 
-2 a} 
CAUTION! Matrix multiplication is not commutative, AB # BA in general. In our 


example, 
i =a |) 2 3) eos been 
2 sj|[-2 o] [2-9+5-(-2  2-3+5-0 


17 3 


8 6] 


Multiplication of an n X n matrix A by a vector x with n components is defined by the 
same rule: v = Ax is the vector with the n components 


n 
v,3=> AjmXm J=lyccyn. 
m=1 
For example, 
12 7 X1 12x4 + 71x92 
—8 3 X2 —8x1 + 3x9 , 


Systems of ODEs as Vector Equations 


Differentiation. The derivative of a matrix (or vector) with variable entries (or 
components) is obtained by differentiating each entry (or component). Thus, if 


Pp wa 
yo = = , then y (t) = = ; 
yo(t) sin t y5(t) cos t 
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Using matrix multiplication and differentiation, we can now write (1) as 


! 


(7) y = » &8, Y= 


13 


a1 a y1 


I 
Nn 
Nie Nw 
—— 
S X< 
N ee 
ee 


421 422} | y2 


Similarly for (2) by means of ann X n matrix A and acolumn vector y with n components, 
namely, y’ = Ay. The vector equation (7) is equivalent to two equations for the 
components, and these are precisely the two ODEs in (1). 


Some Further Operations and Terms 


Transposition is the operation of writing columns as rows and conversely and is indicated 
by T. Thus the transpose A’ of the 2 X 2 matrix 


a1 a42 —5 2 r a41 a21 =5 13 
A= = ; is A = = ut 
az1 22, 130 3 a2 22 2 3 
The transpose of a column vector, say, 
U1 
= : T 
v= : is a row vector, v =[v1 Ug], 
v2 


and conversely. 


Inverse of a Matrix. Then X n unit matrix I is the n X n matrix with main diagonal 
1, 1,---, 1 and all other entries zero. If, for a given n X n matrix A, there is ann Xn 
matrix B such that AB = BA = I, then A is called nonsingular and B is called the inverse 
of A and is denoted by Al; thus 


(8) AAT=ATA=L 


The inverse exists if the determinant det A of A is not zero. 
If A has no inverse, it is called singular. For n = 2, 


41 1 422 442 
(9) ~ det A : 
~ag1 a1 
where the determinant of A is 
41 412 
(10) det A = = 411492 — 449209}. 
a21 422 


(For general n, see Sec. 7.7, but this will not be needed in this chapter.) 
Linear Independence. , given vectors v),---,v with n components are called a 


linearly independent set or, more briefly, linearly independent, if 


(11) ave + +ev=0 
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implies that all scalars cy,---, c, must be zero; here, 0 denotes the zero vector, whose n 
components are all zero. If (11) also holds for scalars not all zero (so that at least one of 
these scalars is not zero), then these vectors are called a linearly dependent set or, briefly, 
linearly dependent, because then at least one of them can be expressed as a linear 
combination of the others; that is, if, for instance, cy # 0 in (11), then we can obtain 


gv = QE ss4 My, 


1 
~e, (CoV + CV 


Eigenvalues, Eigenvectors 


Eigenvalues and eigenvectors will be very important in this chapter (and, as a matter of 
fact, throughout mathematics). 
Let A = [a;,] be ann X n matrix. Consider the equation 


(12) Ax = Ax 


where A is a scalar (a real or complex number) to be determined and x is a vector to be 
determined. Now, for every A, a solution is x = 0. A scalar A such that (12) holds for 
some vector x # 0 is called an eigenvalue of A, and this vector is called an eigenvector 
of A corresponding to this eigenvalue A. 

We can write (12) as Ax — Ax = 0 or 


(13) (A — ADx = 0. 


These are n linear algebraic equations in the m unknowns x4,°--, xX, (the components 
of x). For these equations to have a solution x # 0, the determinant of the coefficient 
matrix A — AI must be zero. This is proved as a basic fact in linear algebra (Theorem 4 
in Sec. 7.7). In this chapter we need this only for n = 2. Then (13) is 


a1 Xr a42 XY 0 
(14) = ; 
d21 d22 — Xr X9 0 
in components, 
(44, — A)xy + ax. = =0 


cl") 
a91X1 ud (doo _ A)x9 = 0. 


Now A — ALis singular if and only if its determinant det (A — AID), called the characteristic 
determinant of A (also for general 7), is zero. This gives 


a4, —A a2 
i (A= a1 = 
21 dog — A 
(15) = (a41 — A)(a22 — A) — ay2G21 


2 
= AX — (ay + dgg)A + a41d22 — ay2Qd21 = 0. 
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This quadratic equation in A is called the characteristic equation of A. Its solutions are 
the eigenvalues A, and Ag of A. First determine these. Then use (14*) with A = A, to 
determine an eigenvector x of A corresponding to A,. Finally use (14*) with A = Ag 
to find an eigenvector x” of A corresponding to Ag. Note that if x is an eigenvector of 
A, so is kx with any k # 0. 


Eigenvalue Problem 


Find the eigenvalues and eigenvectors of the matrix 


(16) A= 
-16 12 


—4.0 "] 


Solution. The characteristic equation is the quadratic equation 


—-4—-2X 4 


dt — a = =)? +28A + 16 =0. 


6 120d 


It has the solutions Ay = —2 and Ag = —0.8. These are the eigenvalues of A. 
Eigenvectors are obtained from (14*). For A = Ay = —2 we have from (14*) 


(-4.0 + 2.0)x,+ 40x. =0 


1.6x4 (1.2 + 2.0)xo = 0. 


A solution of the first equation is x; = 2, xg = 1. This also satisfies the second equation. (Why?) Hence an 
eigenvector of A corresponding to Ay = —2.0 is 


2 1 
(17) x) = : Similarly, x? = 
1 0.8 


is an eigenvector of A corresponding to Ag = —0.8, as obtained from (14*) with A = Ag. Verify this. tel] 


4.1 Systems of ODEs as Models 
in Engineering Applications 


EXAMPLE 1 


We show how systems of ODEs are of practical importance as follows. We first illustrate 
how systems of ODEs can serve as models in various applications. Then we show how a 
higher order ODE (with the highest derivative standing alone on one side) can be reduced 
to a first-order system. 


Mixing Problem Involving Two Tanks 


A mixing problem involving a single tank is modeled by a single ODE, and you may first review the 
corresponding Example 3 in Sec. 1.3 because the principle of modeling will be the same for two tanks. The 
model will be a system of two first-order ODEs. 

Tank 7, and 73 in Fig. 78 contain initially 100 gal of water each. In 7, the water is pure, whereas 150 lb of 
fertilizer are dissolved in 7. By circulating liquid at a rate of 2 gal/min and stirring (to keep the mixture uniform) 
the amounts of fertilizer yj(¢) in J, and yo(t) in 74 change with time t. How long should we let the liquid circulate 
so that 7 will contain at least half as much fertilizer as there will be left in 7%? 
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y(t) 
150 
y(t) 
2 gal/min f 100/- : 
| 
<a 75L- = = 
| 
2 gal/min | 50 y(t) 
Cz _]| | 
} I 
\ | 
: 0 | | 
System of tanks (6) 27.5 50 100 t 


Fig. 78. Fertilizer content in Tanks T, (lower curve) and T, 


Solution. Step 1. Setting up the model. As for a single tank, the time rate of change y}(t) of y;(t) equals 
inflow minus outflow. Similarly for tank 74. From Fig. 78 we see that 


4 
c= Inflow/min — Outflow/min = y y Tank 7, 
J / / 100°2 1 J ( 1) 
ya = Infl /mi — Outflow/mi = Se ee (Tank 7) 
nilow/min utTLO min y ¥ an. . 
J2 I x1 1 y2 2 


Hence the mathematical model of our mixture problem is the system of first-order ODEs 


yt = —0.02y, + 0.02y2 (Tank 7%) 
yo = 0.02y1 — 0.02y2 (Tank 73). 
y1 
As a vector equation with column vector y = | | and matrix A this becomes 
y2 
—0.02 0.02 
y’ = Ay, where A= : 
0.02 —0.02 


Step 2. General solution. As for a single equation, we try an exponential function of ¢, 
(1) y=xe = Then iy’ = Axe’ = Axe*®, 


Dividing the last equation Axe** = Axe** by e** and interchanging the left and right sides, we obtain 


We need nontrivial solutions (solutions that are not identically zero). Hence we have to look for eigenvalues 
and eigenvectors of A. The eigenvalues are the solutions of the characteristic equation 


—0.02-A 0.02 
(2) det (A — AI) = (—0.02 — A)? — 0.02? = A(A + 0.04) = 0. 
0.02 —0.02 -A 


We see that A, = 0 (which can very well happen—don’t get mixed up—it is eigenvectors that must not be zero) 
and Ag = —0.04. Eigenvectors are obtained from (14*) in Sec. 4.0 with A = 0 and A = —0.04. For our present 
A this gives [we need only the first equation in (14*)] 


—0.02x, + 0.02x2 = 0 and (—0.02 + 0.04)x1 + 0.02x2 = 0, 
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respectively. Hence x1 = xg and x1 = —Xg, respectively, and we can take x, = x2 = | and xy = —xg = I. 
This gives two eigenvectors corresponding to Ay = 0 and Ag = —0.04, respectively, namely, 


From (1) and the superposition principle (which continues to hold for systems of homogeneous linear ODEs) 
we thus obtain a solution 


(3) y= Cx Pett + cox edt = oy 


where cj, and cz are arbitrary constants. Later we shall call this a general solution. 


Step 3. Use of initial conditions. The initial conditions are y;(0) = 0 (no fertilizer in tank 7,) and yo(0) = 150. 
From this and (3) with t = 0 we obtain 


1 1 cy + C2 0 
y(O) = cy + Co = 4 
1 a | cy — Cg 150 
In components this is cy + Cg = 0,cy — cg = 150. The solution is cy = 75, co = —75. This gives the answer 
1 1 
y = 75xP — 75x@e-0.04t = 75 75 7 0.04t. 
1 =1 
In components, 
yy = 75 - T5e 0.04 (Tank T,, lower curve) 
yg = 75+ 756° 0 (Tank 75, upper curve). 


Figure 78 shows the exponential increase of y; and the exponential decrease of yg to the common limit 75 |b. 
Did you expect this for physical reasons? Can you physically explain why the curves look “symmetric”? Would 
the limit change if 7, initially contained 100 lb of fertilizer and % contained 50 1b? 


Step 4. Answer. 7, contains half the fertilizer amount of 7) if it contains 1/3 of the total amount, that is, 
50 Ib. Thus 


y, = 75 — 75e~°4# = 50, e 04 = 5, t = (In 3)/0.04 = 27.5. 
Hence the fluid should circulate for at least about half an hour. 


Electrical Network 


Find the currents /;(t) and /s(t) in the network in Fig. 79. Assume all currents and charges to be zero at t = 0, 
the instant when the switch is closed. 


L=l1henry C=0.25 farad 


R, = 6 ohms 


Fig. 79. Electrical network in Example 2 


Solution. Step 1. Setting up the mathematical model. The model of this network is obtained from 
Kirchhoff’s Voltage Law, as in Sec. 2.9 (where we considered single circuits). Let /,(f) and /y(t) be the currents 
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in the left and right loops, respectively. In the left loop, the voltage drops are LI, = I; [V] over the inductor 
and Ry, — Ig) = 4, — Ig) [V] over the resistor, the difference because /, and Jy flow through the resistor in 
opposite directions. By Kirchhoff’s Voltage Law the sum of these drops equals the voltage of the battery; that 
is, [1 + 4(, — Ip) = 12, hence 


(4a) I, = —4, + 4b + 12. 


In the right loop, the voltage drops are Rola = 6/2 [V] and Ry(lg — 11) = 42g — 1y) [V] over the resistors and 
d/C)f In dt = 4f Iz dt [V] over the capacitor, and their sum is zero, 


6lz + 4g — )) 4 4| teat 0 or 101, ~ 4h, + 4] fadr = 0, 


Division by 10 and differentiation gives Ib — 0.41, + 041g = 0. 
To simplify the solution process, we first get rid of 0.4/{, which by (4a) equals 0.4(—4/, + 4/5 + 12). 
Substitution into the present ODE gives 


Ib = 041, — 04lg = 0.4(—41, + 41g + 12) — 0.45 
and by simplification 
(4b) Ih = -1.6l, + 1.2lg + 4.8. 
In matrix form, (4) is (we write J since I is the unit matrix) 


wT —40 40 12.0 
(5) J’ =AJ +a, where J= , A= , £= ' 
In =1,6. 1:2 4.8 


Step 2. Solving (5). Because of the vector g this is a nonhomogeneous system, and we try to proceed as for a 
single ODE, solving first the homogeneous system J’ = AJ (thus J’ — AJ = 0) by substituting J = xe**. This 
gives 


J’ = Axe*™ = Axe’, hence Ax = Ax. 


Hence, to obtain a nontrivial solution, we again need the eigenvalues and eigenvectors. For the present matrix 
A they are derived in Example | in Sec. 4.0: 


Hence a “general solution” of the homogeneous system is 


Jn = cx Pent + cx e708, 


For a particular solution of the nonhomogeneous system (5), since g is constant, we try a constant column 
vector J, = a with components a}, dg. Then D5 = 0, and substitution into (5) gives Aa + g = 0; in components, 


4.0a, + 4.0a2 + 12.0 = 0 


1.6a, + 1.2a9 4.8 = 0. 


The solution is a, = 3, ag = 0; thus a = . Hence 


(6) J=Jn+ Jp = cx Pet + Coxe 08t + a; 
in components, 
Ty = 2cye"*£ + cn 986 4. 3 


In = cye + 0.8c9e~ 08". 
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The initial conditions give 


1,(0) = 2c3 + co +3 =0 

In(0) = cy + 0.8c9 =0. 
Hence cy = —4 and cg = 5. As the solution of our problem we thus obtain 
(7) J = —4x Pen 7t 4+ 5x e-9-8t 4 a 
In components (Fig. 80b), 

i Be 7 + 5079 8t 4 3 

In = —4e778 + 4e70 8, 


Now comes an important idea, on which we shall elaborate further, beginning in Sec. 4.3. Figure 80a shows 
1,(t) and J9(t) as two separate curves. Figure 80b shows these two currents as a single curve [/;(¢), /o(¢)] in the 
I,Ig-plane. This is a parametric representation with time f¢ as the parameter. It is often important to know in 
which sense such a curve is traced. This can be indicated by an arrow in the sense of increasing f, as is shown. 
The J;/-plane is called the phase plane of our system (5), and the curve in Fig. 80b is called a trajectory. We 
shall see that such “phase plane representations” are far more important than graphs as in Fig. 80a because 
they will give a much better qualitative overall impression of the general behavior of whole families of solutions, 
not merely of one solution as in the present case. | 


I, 

LS 
1 | 

0.5- 
6) L | L 

t ) 1 2 3 4 5 I, 
(a) Currents I, (b) Trajectory [Z,(t), Lor 
(upper curve) in the I,J,-plane 
and I, (the “phase plane”) 


Fig. 80. Currents in Example 2 


Remark. In both examples, by growing the dimension of the problem (from one tank to 
two tanks or one circuit to two circuits) we also increased the number of ODEs (from one 
ODE to two ODEs). This “growth” in the problem being reflected by an “increase” in the 
mathematical model is attractive and affirms the quality of our mathematical modeling and 
theory. 


Conversion of an nth-Order ODE to a System 


We show that an nth-order ODE of the general form (8) (see Theorem 1) can be converted 
to a system of n first-order ODEs. This is practically and theoretically important— 
practically because it permits the study and solution of single ODEs by methods for 
systems, and theoretically because it opens a way of including the theory of higher order 
ODEs into that of first-order systems. This conversion is another reason for the importance 
of systems, in addition to their use as models in various basic applications. The idea of 
the conversion is simple and straightforward, as follows. 
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THEOREM =—1 


PROOF 


EXAMPLE 3 


Conversion of an ODE 
An nth-order ODE 


(8) yO Shey yo 


can be converted to a system of n first-order ODEs by setting 
(9) W=y w=, w= HV. 


This system is of the form 


il > 
ye) = 8} 
(10) 
Yn-1 = Yn 


Yn = F(t, Y1, Ya."**» Yn): 


The first n — | of these n ODEs follows immediately from (9) by differentiation. Also, 
Th = ike by (9), so that the last equation in (10) results from the given ODE (8). (t| 


Mass on a Spring 


To gain confidence in the conversion method, let us apply it to an old friend of ours, modeling the free motions 
of a mass on a spring (see Sec. 2.4) 


c k 
my" + cy’ + ky =0 or y" =—-—y'-—y. 
= cs m~ m 
For this ODE (8) the system (10) is linear and homogeneous, 
Yir= 22 
. k c 
y2~ 7, Yl yy, 92 
V1 
Setting y = , we get in matrix form 
y2 
; Fi 
JL 
yray=| , «¢ | | 
Tm m |W 
The characteristic equation is 
—Xr 1 
9 ce k 
det (A — AD = =) +—A+—=0. 
k Cc m m 
SS. = SK 
m m 
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It agrees with that in Sec. 2.4. For an illustrative computation, let m = 1,c = 2, and k = 0.75. Then 


A? + 20 + 0.75 = (A + 0.5)(A + 1.5) = 0. 
This gives the eigenvalues Ay = —0.5 and Ay = —1.5. Eigenvectors follow from the first equation in A — AI = 0, 
which is —Ax, + xg = 0. For A, this gives 0.5x1 + xg = 0, say, x1 = 2, x2 = —1. For Ag = —1.5 it gives 
1.5x1 + xg = 0, say, xy = 1, x2 = —1.5. These eigenvectors 

2 1 2 1 
xD = , x= give y=c¢ eo O5t + oy eT hat. 
-1 =15 = =15 
This vector solution has the first component 
y = yy = Beye! + eye 15t 

which is the expected solution. The second component is its derivative 

yo y’ cye 9! = 1 Scge7 ht, | 


PROBLEM SET 4-1 


1-6 


MIXING PROBLEMS 


1, 


6. 


Find out, without calculation, whether doubling the 
flow rate in Example 1 has the same effect as halfing 
the tank sizes. (Give a reason.) 


. What happens in Example 1 if we replace 7, by a tank 


containing 200 gal of water and 150 lb of fertilizer 
dissolved in it? 


. Derive the eigenvectors in Example 1 without consulting 


this book. 


. In Example 1 find a “general solution” for any ratio 


a = (flow rate)/(tank size), tank sizes being equal. 
Comment on the result. 


. If you extend Example | by a tank 73 of the same size 


as the others and connected to 7% by two tubes with 
flow rates as between 7; and 75, what system of ODEs 
will you get? 


Find a “general solution” of the system in Prob. 5. 


ELECTRICAL NETWORK 


In Example 2 find the currents: 


7. 


9. 


If the initial currents are 0 A and —3 A (minus meaning 
that /5(0) flows against the direction of the arrow). 


. If the capacitance is changed to C = 5/27 F. (General 


solution only.) 


If the initial currents in Example 2 are 28 A and 14 A. 


10-13 


CONVERSION TO SYSTEMS 


Find a general solution of the given ODE (a) by first converting 


it to a system, (b), as given. Show the details of your work. 
10. y” + 3y’ + 2y =0 11. 4y” — 15y’ — 4y =0 
12. y"” + 2y” —y'’ — 2y=0 

13. y"” + 2y’ — 24y =0 


14. TEAM PROJECT. Two Masses on Springs. (a) Set 


15. 


up the model for the (undamped) system in Fig. 81. 
(b) Solve the system of ODEs obtained. Hint. Try 
y = xe" and set w” = A. Proceed as in Example 1 or 
2. (c) Describe the influence of initial conditions on the 
possible kind of motions. 


ky =3 
(y, = 0) m,=1 5 
nal 
ka=2 (Net change in 
spring length 
(¥p= 0) m,=1 =Vo- I) 
Yo 
V2 
System in : 
static System in 
equilibrium motion 
Fig. 81. Mechanical system in Team Project 


CAS EXPERIMENT. Electrical Network. (a) In 
Example 2 choose a sequence of values of C that 
increases beyond bound, and compare the corresponding 
sequences of eigenvalues of A. What limits of these 
sequences do your numeric values (approximately) 
suggest? 

(b) Find these limits analytically. 

(c) Explain your result physically. 

(d) Below what value (approximately) must you decrease 
C to get vibrations? 
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4.2 Basic Theory of Systems of ODEs. 
Wronskian 


THEOREM 1 


In this section we discuss some basic concepts and facts about system of ODEs that are 
quite similar to those for single ODEs. 
The first-order systems in the last section were special cases of the more general system 


yi = ikl Mila? 5 Sep) 
yo = IPAs 52? 5 Ma) 
(1) 


Yn = jill Min??? o Mak 


We can write the system (1) as a vector equation by introducing the column vectors 
y=([y-"° o andf=[fi °°: Pall (where T means transposition and saves us 
the space that would be needed for writing y and f as columns). This gives 


(1) y =f(y). 
This system (1) includes almost all cases of practical interest. For n = 1 it becomes 


y', = fit, yx) or, simply, y’ = f(t, y), well known to us from Chap. 1. 
A solution of (1) on some interval a < t < bis a set of n differentiable functions 


yr = hi), +s Yn = hn 
on a <t<b that satisfy (1) throughout this interval. In vector from, introducing the 
“solution vector” h = [hy -:- hy)" (a column vector!) we can write 
y = h(). 


An initial value problem for (1) consists of (1) and n given initial conditions 


(2) yito) = K4, ya(to) = Ko, ae Yn(to) = Kn, 


in vector form, y(to9) = K, where fo is a specified value of ¢ in the interval considered and 
the components of K = [Ky --- eae are given numbers. Sufficient conditions for the 
existence and uniqueness of a solution of an initial value problem (1), (2) are stated in 
the following theorem, which extends the theorems in Sec. 1.7 for a single equation. (For 
a proof, see Ref. [A7].) 


Existence and Uniqueness Theorem 


Let fi,-++, fn in (1) be continuous functions having continuous partial derivatives 
Of1/dy1,°°*, Of1/OVn.***, Ofn/AVn in some domain R_ of tyyy2°**Yn-space 
containing the point (to, K1,°*+, Ky). Then (1) has a solution on some interval 


to —-a<t<to+ a satisfying (2), and this solution is unique. 
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Linear Systems 


Extending the notion of a linear ODE, we call (1) a linear system if it is linear in 
Y1,°°', Yn} that is, if it can be written 


Yt = Qu(Oy1 + +++ + din(Dyn + gid) 
(3) 


Vin = Ani ()y1 ae eal AnnDyn ar AC). 


As a vector equation this becomes 


(3) y =Ayt+g 

a1 aad ain Y1 §1 
where A= , y= , s- 

Ani ann Yn &n 


This system is called homogeneous if g = 0, so that it is 
(4) y’ = Ay. 


If g # 0, then (3) is called nonhomogeneous. For example, the systems in Examples | and 
3 of Sec. 4.1 are homogeneous. The system in Example 2 of that section is nonhomogeneous. 

For a linear system (3) we have df; /dy1 = a11(t),°**, An /OYn = Gny(t) in Theorem 1. 
Hence for a linear system we simply obtain the following. 


THEOREM 2 Existence and Uniqueness in the Linear Case 


Let the aj;x,’s and g;’s in (3) be continuous functions of t on an open interval 
a <t< B containing the point t = to. Then (3) has a solution y(t) on this interval 
satisfying (2), and this solution is unique. 


As for a single homogeneous linear ODE we have 


THEOREM 3 Superposition Principle or Linearity Principle 


If ye and y are solutions of the homogeneous linear system (4) on some interval, 
so is any linear combination y = cy yo + cy ys. 


PROOF _ Differentiating and using (4), we obtain 


! 


y =Ic1y 
= cy’ + coy 
cyAy? 4: coAy 


A(cyy™ + coy™) = Ay. a 


(65) a cy?) 


(2)! 
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The general theory of linear systems of ODEs is quite similar to that of a single linear 
ODE in Secs. 2.6 and 2.7. To see this, we explain the most basic concepts and facts. For 
proofs we refer to more advanced texts, such as [A7]. 


Basis. General Solution. Wronskian 


By a basis or a fundamental system of solutions of the homogeneous system (4) on some 
interval J we mean a linearly independent set of n solutions yo eae yo" of (4) on that 
interval. (We write J because we need I to denote the unit matrix.) We call a corresponding 
linear combination 


(5) Yay ty (cy,°**, Cy arbitrary) 


a general solution of (4) on J. It can be shown that if the ajz,(¢) in (4) are continuous on 
J, then (4) has a basis of solutions on J, hence a general solution, which includes every 
solution of (4) on J. 


We can write n solutions yo, mis yo of (4) on some interval J as columns of ann X n 
matrix 
1 
(6) Yep 8 ay 
The determinant of Y is called the Wronskian of ee at x, written 
qd) (2) (n) 
Yi Yi ca 
qd) Q) (n) 
qd) (n) y2 v2 wT 2 
(7) Wy ee = 
qd) (2) (n) 
Yn Yn a ve 


The columns are these solutions, each in terms of components. These solutions form a 
basis on J if and only if W is not zero at any f, in this interval. W is either identically 
zero or nowhere zero in J. (This is similar to Secs. 2.6 and 3.1.) 

If the solutions yo, ee sy? in (5) form a basis (a fundamental system), then (6) is 
often called a fundamental matrix. Introducing a column vector ¢ = [cy co °°: cyl" ; 


we can now write (5) simply as 


(8) y = Ye. 


Furthermore, we can relate (7) to Sec. 2.6, as follows. If y and z are solutions of a 
second-order homogeneous linear ODE, their Wronskian is 


z 
Wy, z) = 


le 


y 2 


To write this ODE as a system, we have to set y = yy, y’ = y, = yg and similarly for z 
(see Sec. 4.1). But then W(y, z) becomes (7), except for notation. 
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4.3 Constant-Coefficient Systems. 


Phase Plane Method 


Continuing, we now assume that our homogeneous linear system 
(1) Vy Ay 


under discussion has constant coefficients, so that the n X n matrix A = [a;x] has entries 
not depending on ¢. We want to solve (1). Now a single ODE y’ = ky has the solution 
y= Ce, So let us try 


(2) y = xe", 


Substitution into (1) gives y’ = Axe*’ = Ay = Axe”. Dividing by e*’, we obtain the 
eigenvalue problem 


(3) Ax = Xx. 


Thus the nontrivial solutions of (1) (solutions that are not zero vectors) are of the form 
(2), where A is an eigenvalue of A and x is a corresponding eigenvector. 
We assume that A has a linearly independent set of n eigenvectors. This holds in most 


applications, in particular if A is symmetric (a,j = aj,) or skew-symmetric (ay; = —ajx) 
or has n different eigenvalues. 

Let those eigenvectors be xP \---.x™ and let them correspond to eigenvalues 
A1,°**, An (which may be all different, or some—or even all—may be equal). Then the 


corresponding solutions (2) are 


(4) hee = xe ben, vie = xMernt 


Their Wronskian W = wy, oe y”) [(7) in Sec. 4.2] is given by 


xem? nea x (Pernt xP du Gi 
ss xPemt eh x$Pernt nae Pt ad wdue ated 
= NW) _ = eid n 
W= (y Gy ) ~~ =e" 
xDemt oaks xVornt xe Ae x 


On the right, the exponential function is never zero, and the determinant is not zero either 
because its columns are the n linearly independent eigenvectors. This proves the following 
theorem, whose assumption is true if the matrix A is symmetric or skew-symmetric, or if 
the n eigenvalues of A are all different. 
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THEOREM 1 


EXAMPLE 1 


General Solution 


If the constant matrix A in the system (1) has a linearly independent set of n 
eigenvectors, then the corresponding solutions yo, cae x in (4) form a basis of 
solutions of (1), and the corresponding general solution is 


(QD) Aat 


(5) y = cx Pert + 0. + o,xMer!, 


How to Graph Solutions in the Phase Plane 


We shall now concentrate on systems (1) with constant coefficients consisting of two 
ODEs 


; V1 = ay1y1 + ayo 
(6) y’ = Ay; in components, 


ya = doiy1 + ag2yo. 


Of course, we can graph solutions of (6), 


(7) y(t) = , 


as two curves over the f-axis, one for each component of y(t). (Figure 80a in Sec. 4.1 shows 
an example.) But we can also graph (7) as a single curve in the y; yg-plane. This is a parametric 
representation (parametric equation) with parameter ¢. (See Fig. 80b for an example. Many 
more follow. Parametric equations also occur in calculus.) Such a curve is called a trajectory 
(or sometimes an orbit or path) of (6). The y y2-plane is called the phase plane.’ If we fill 
the phase plane with trajectories of (6), we obtain the so-called phase portrait of (6). 

Studies of solutions in the phase plane have become quite important, along with 
advances in computer graphics, because a phase portrait gives a good general qualitative 
impression of the entire family of solutions. Consider the following example, in which 
we develop such a phase portrait. 


Trajectories in the Phase Plane (Phase Portrait) 


Find and graph solutions of the system. 
In order to see what is going on, let us find and graph solutions of the system 


=3) 1 


(8) y =Ay= 
1 -3 


| y, thus 


1A name that comes from physics, where it is the y-(mv)-plane, used to plot a motion in terms of position y 
and velocity y’ = v (m = mass); but the name is now used quite generally for the y, ys-plane. 

The use of the phase plane is a qualitative method, a method of obtaining general qualitative information 
on solutions without actually solving an ODE or a system. This method was created by HENRI POINCARE 
(1854-1912), a great French mathematician, whose work was also fundamental in complex analysis, divergent 
series, topology, and astronomy. 
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EXAMPLE 1 


CHAP. 4 Systems of ODEs. Phase Plane. Qualitative Methods 


Solution. By substituting y = xe! and y’ = Axe” and dropping the exponential function we get Ax = Ax. 
The characteristic equation is 


=3'= A, 1 
det (A — AI) = =)? + 6A+8=0. 
1 —3-2 
This gives the eigenvalues Ay = —2 and Ag = —4. Eigenvectors are then obtained from 
(-3 — A)\xy + x2 = 0. 
For A, 2 this is —x, + xg = 0. Hence we can take xP = [1 1)". For Ag = —4 this becomes xy + x2 = 0, 
and an eigenvector is x” = [1 —1]'. This gives the general solution 
Y1 1 1 
y cy? + ey® = cy et + oy eat. 
y2 1 =i 


Figure 82 shows a phase portrait of some of the trajectories (to which more trajectories could be added if so 
desired). The two straight trajectories correspond to c, = 0 and cz = 0 and the others to other choices of 
Cy, C2. | 


The method of the phase plane is particularly valuable in the frequent cases when solving 
an ODE or a system is inconvenient of impossible. 


Critical Points of the System (6) 


The point y = 0 in Fig. 82 seems to be a common point of all trajectories, and we want 
to explore the reason for this remarkable observation. The answer will follow by calculus. 
Indeed, from (6) we obtain 


dyz yodt yx  dg1y1 + dogye 
dyy yidt yy ayy1 + ay2yo" 


(9) 


This associates with every point P: (y1, yo) a unique tangent direction dys/dy, of the 
trajectory passing through P, except for the point P = Py: (0, 0), where the right side of (9) 
becomes 0/0. This point Po, at which dy2/dy, becomes undetermined, is called a critical 
point of (6). 


Five Types of Critical Points 


There are five types of critical points depending on the geometric shape of the trajectories 
near them. They are called improper nodes, proper nodes, saddle points, centers, and 
spiral points. We define and illustrate them in Examples 1-5. 


(Continued ) Improper Node (Fig. 82) 


An improper node is a critical point Po at which all the trajectories, except for two of them, have the same 
limiting direction of the tangent. The two exceptional trajectories also have a limiting direction of the tangent 
at Py which, however, is different. 

The system (8) has an improper node at 0, as its phase portrait Fig. 82 shows. The common limiting direction 
at 0 is that of the eigenvector x = [1 1]" because e~* goes to zero faster than e 24 as t increases. The two 
exceptional limiting tangent directions are those of x? = [1 1" and —x® =[-1 1)". | 
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EXAMPLE 2 


EXAMPLE 3 


Proper Node (Fig. 83) 


A proper node is a critical point Py at which every trajectory has a definite limiting direction and for any given 
direction d at Pp there is a trajectory having d as its limiting direction. 
The system 


1 0 v4 > AL 
(10) y = y, thus 
ya = Yo 


has a proper node at the origin (see Fig. 83). Indeed, the matrix is the unit matrix. Its characteristic equation 
(1 — A)? = 0 has the root A = 1. Any x # 0 is an eigenvector, and we can take [1 O}' and [0 1]". Hence 
a general solution is 


t yi Cye 
e or or C1 y2 = C2y4. | 
1 yg = coe! 


yt) 


NX yy 
Xt) 


Fig. 82. Trajectories of the system (8) Fig. 83. Trajectories of the system (10) 
(Improper node) (Proper node) 


nal 


Saddle Point (Fig. 84) 


A saddle point is a critical point Py at which there are two incoming trajectories, two outgoing trajectories, and 
all the other trajectories in a neighborhood of Po bypass Pp. 


The system 
; 1 0 y= V1 
(11) y= y, thus ' 
0 -l yi = ye 
has a saddle point at the origin. Its characteristic equation (1 — A)(—1 — A) = 0 has the roots Ay = | and 
Ag = —1. For A = 1 an eigenvector [1 oy" is obtained from the second row of (A — ADx = 0, that is, 
Ox, + (-1 — 1)xg = 0. For Ag = —1 the first row gives [0 1)". Hence a general solution is 
1 y= cyet 
= t -t = 
y=c e+ c2 e or or yiy2 = const. 
0 1 yo = coe! 


This is a family of hyperbolas (and the coordinate axes); see Fig. 84. 3 
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Center (Fig. 85) 


A center is a critical point that is enclosed by infinitely many closed trajectories. 


The system 
: 0 1 (a) yi = ye 
(12) y = y; thus 
—4 (b) yg= —4y1 
has a center at the origin. The characteristic equation W+4=0 gives the eigenvalues 27 and —2i. For 2i an 
eigenvector follows from the first equation —2ix, + x2 = Oof (A — ADx = 9, say, [1 2". For A = —2i that 
equation is —(—2i)x, + x2 = 0 and gives, say, [1 —2i]". Hence a complex general solution is 


yy cert as coe tt 


et + Co 


1 . 
eT thus se Bs 
—2i yo = 2icye” — 2icge™. 


1 
(12*) yr 
2i 


A real solution is obtained from (12*) by the Euler formula or directly from (12) by a trick. (Remember the 
trick and call it a method when you apply it again.) Namely, the left side of (a) times the right side of (b) is 
—4yyy}. This must equal the left side of (b) times the right side of (a). Thus, 


—4yiy} = yoyo. By integration, 24 + 5 3 = const. 


This is a family of ellipses (see Fig. 85) enclosing the center at the origin. B 


V \ 
ais ae 
Vy na 
h (7 
Fig. 84. Trajectories of the system (11) Fig. 85. Trajectories of the system (12) 


(Saddle point) (Center) 


Spiral Point (Fig. 86) 


A spiral point is a critical point Py about which the trajectories spiral, approaching Pp as t > © (or tracing these 
spirals in the opposite sense, away from Pp). 
The system 


=i 1 yr = yi + yo 
(13) y’ = y, thus : 


Joy 2 


has a spiral point at the origin, as we shall see. The characteristic equation is N+ 2A+2=0.It gives the 
eigenvalues —1 + i and —1 — i. Corresponding eigenvectors are obtained from (—1 — A)xy + xg = 0. For 
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EXAMPLE 6 


= —1 + ithis becomes —ix; + x2 = 0 and we can take [1 i] as an eigenvector. Similarly, an eigenvector 
corresponding to —1 —iis[1 —i]'. This gives the complex general solution 


1 ; 
ec l-at. 
=1 


The next step would be the transformation of this complex solution to a real general solution by the Euler 


1 ‘ 
y= o| jee + 
i 


formula. But, as in the last example, we just wanted to see what eigenvalues to expect in the case of a spiral 
point. Accordingly, we start again from the beginning and instead of that rather lengthy systematic calculation 
we use a shortcut. We multiply the first equation in (13) by yj, the second by ys, and add, obtaining 


yiyi + yey = —O7 + y3). 


We now introduce polar coordinates r, f, where r= yt oF ye. Differentiating this with respect to t gives 


2rr’ = 2y,y, + 2yey9. Hence the previous equation can be written 
| ns) | spas oe = % =. a= 
rr=-r’, Thus, r=-r, dr/r = —dt, In |r| = -t + c®, r=ce. 
For each real c this is a spiral, as claimed (see Fig. 86). | 
V2 


fe : 


Fig. 86. Trajectories of the system (13) (Spiral point) 


No Basis of Eigenvectors Available. Degenerate Node (Fig. 87) 


This cannot happen if A in (1) is symmetric (a,j; = jx, as in Examples 1—3) or skew-symmetric (a,j = —djx, 
thus aj; = 0). And it does not happen in many other cases (see Examples 4 and 5). Hence it suffices to explain 
the method to be used by an example. 

Find and graph a general solution of 


(14) y =ay= 


Solution. A is not skew-symmetric! Its characteristic equation is 


4-2 1 


det (A — AD) = 
=] 224 


146 


CHAP. 4 Systems of ODEs. Phase Plane. Qualitative Methods 


It has a double root A = 3. Hence eigenvectors are obtained from (4 — A)x1 + xg = 0, thus from x, + xg = 0, 
say, x =[1 —1]" and nonzero multiples of it (which do not help). The method now is to substitute 


y® = xte** + ue 


with constant u = [wy us|" into (14). (The xt-term alone, the analog of what we did in Sec. 2.2 in the case 
of a double root, would not be enough. Try it.) This gives 


y "= xe™ + Axte™ + Aue” Ay? Axte™ + Aue. 
On the right, Ax = Ax. Hence the terms Axte*’ cancel, and then division by e*” gives 
x + Au = Au, thus (A — ADu =x. 


Here A = 3 and x = [1 -1y', so that 


4-3 1 1 uy + ug=1 
(A — 3Du = u= P thus 
| 23 = Uy — ug = —-1. 


A solution, linearly independent of x = [1 —1)', isu = [0 1". This yields the answer (Fig. 87) 


et + ce 


=I 


eth) @) _ 
y=ciy” + coy -a 


The critical point at the origin is often called a degenerate node. ay? gives the heavy straight line, with 
cy > 0 the lower part and cy < 0 the upper part of it. y” gives the right part of the heavy curve from 0 through 


the second, first, and—finally—fourth quadrants. —y gives the other part of that curve. a 
Ye 
al 
(2) 
y 
(1) 
y 


Fig. 87. Degenerate node in Example 6 


We mention that for a system (1) with three or more equations and a triple eigenvalue 
with only one linearly independent eigenvector, one will get two solutions, as just 
discussed, and a third linearly independent one from 


y = 5Xt e+ ute® + ver with v from u + Av = Av. 
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PROBLEM SET 4-3 


1-9| GENERAL SOLUTION 


Find a real general solution of the following systems. Show 
the details. 


1. yp =y1 + ye 
t 
yo = 3y1 — ye 


- yi = 6y1 + 9y2 
ya = y1 + 6y2 


3. yi = y1 + 2ye 
y2 = 391 + ye 

4, yi) = —8y] — 2ye 
y2 = 2y1 — 4y2 

5. yi = 2y1 + 5ye 
ya = 5y1 + 12.5ye 

6. yi = 2y1 — 2ye 
ya = 2y1 + 2ye 

7. yi = Ye 
y2= yi t+ ys 
y3 = Ye 

8. yi = 8y1 — ye 
ya = v1 + 10yo 

9. yz = 10y; — 10y2 — 4y3 
y2 = —10y1 + yo — My 
yg = —4y1 — l4y2 — 2yg 

10-15 IVPs 

Solve the following initial value problems. 


10. y, = 2y1 + 2ye 


e 
ya = 5y1 — ya 


y1(0) = 0, y2(0) = 7 
11. y, = 2y, + Sye 

ya = —3y1 — 3Y2 

y1(0) = —12, ye(0) = 0 


12. yi = y1 + 3ye 
yo=3yit ye 
yy(O) = 12,  yo(0) = 2 


YI = y2 
ya a, 
y1(0) = 0, 


13. 


y2(0) = 2 
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14. yi = —y1 — yo 

Ye = Yi = V2: 

yz(0) = 1, yo(0) = 0 
15. y, = 3y, + 2y2 

ya = 2y1 + 3yz 

y1(0) = 0.5, ya(0) = —0.5 
16-17| CONVERSION 


Find a general solution by conversion to a single ODE. 

16. The system in Prob. 8. 

17. The system in Example 5 of the text. 

18. Mixing problem, Fig. 88. Each of the two tanks 
contains 200 gal of water, in which initially 100 lb 
(Tank 7,) and 200 lb (Tank 75) of fertilizer are dissolved. 
The inflow, circulation, and outflow are shown in 
Fig. 88. The mixture is kept uniform by stirring. Find 
the fertilizer contents y,(f) in 7 and yo(f) in Tb. 


12 gal/min 4 gal/min 


(Pure water) 


16 gal/min 12 gal/min 


Fig. 88. Tanks in Problem 18 


19. Network. Show that a model for the currents /;(f) and 
Io(t) in Fig. 89 is 


1 
I, dt 4 
a 


Find a general solution, assuming that R = 3Q, 
L=4H,C=1/12F. 


RU, — bb) =0, Li + Re — hh) = 0. 


Fig. 89. Network in Problem 19 
20. CAS PROJECT. Phase Portraits. Graph some of 
the figures in this section, in particular Fig. 87 on the 
degenerate node, in which the vector y@ depends on . 
In each figure highlight a trajectory that satisfies an 
initial condition of your choice. 
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4.4 Criteria for Critical Points. Stability 


We continue our discussion of homogeneous linear systems with constant coefficients (1). 
Let us review where we are. From Sec. 4.3 we have 
' 


; a1 4412 y1 = 4111 + ay2ye 
dd) y =Ay= y, in components, 


a2, dag Yo = d21y1 + a22Ye. 


From the examples in the last section, we have seen that we can obtain an overview of 
families of solution curves if we represent them parametrically as y(t) = [y1(4) yo(t)]" 
and graph them as curves in the y; yo-plane, called the phase plane. Such a curve is called 
a trajectory of (1), and their totality is known as the phase portrait of (1). 

Now we have seen that solutions are of the form 


y(t) = xe, Substitution into (1) gives y (t) = Axe = Ay = Axe’. 
Dropping the common factor e**, we have 
(2) Ax = Xx. 


Hence y(f) is a (nonzero) solution of (1) if A is an eigenvalue of A and x a corresponding 
eigenvector. 

Our examples in the last section show that the general form of the phase portrait is 
determined to a large extent by the type of critical point of the system (1) defined as a 
point at which dyg/dy, becomes undetermined, 0/0; here [see (9) in Sec. 4.3] 


dy2 _yodt agi + dagyo 
dy, yidt) ayyit areye 


(3) 


We also recall from Sec. 4.3 that there are various types of critical points. 
What is now new, is that we shall see how these types of critical points are related 
to the eigenvalues. The latter are solutions A = A and Ag of the characteristic equation 


a4,—A a2 


(4) det (A — AD = = dX? — (ay, + apg)A + det A = 0. 


a21 d22 — A 


This is a quadratic equation dN? — pA + q = 0 with coefficients p, q and discriminant A 
given by 


(5) p =a, +422, gq = det A = ay4d92 — ay2a21, A =p” — 4¢. 
From algebra we know that the solutions of this equation are 


(6) dM =3(p+ VA), An = 3 (p — VA). 
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DEFINITIONS 


Furthermore, the product representation of the equation gives 


dN? — pA +g = (A — Aq)(A — Ag) = A® = (Ay + Ag)A + AqAg. 


Hence p is the sum and q the product of the eigenvalues. Also Ay — Ag = VA from (6). 
Together, 


(7) P=Ay+Ax qg@=Aidz, A =(Ay — Az)”. 


This gives the criteria in Table 4.1 for classifying critical points. A derivation will be 
indicated later in this section. 


Table 4.1 Eigenvalue Criteria for Critical Points 
(Derivation after Table 4.2) 


Name p=Ay +r. | q@= Ade | A = (Ay — Ag) | Comments on Aj, Ag 
(a) Node q>0 A2O0 Real, same sign 
(b) Saddle point q<0 Real, opposite signs 
(c) Center p=0 q>0 Pure imaginary 
(d) Spiral point p#O0 A<0 Complex, not pure 
imaginary 


Stability 


Critical points may also be classified in terms of their stability. Stability concepts are basic 
in engineering and other applications. They are suggested by physics, where stability 
means, roughly speaking, that a small change (a small disturbance) of a physical system 
at some instant changes the behavior of the system only slightly at all future times ¢. For 
critical points, the following concepts are appropriate. 


Stable, Unstable, Stable and Attractive 


A critical point Po of (1) is called stable? if, roughly, all trajectories of (1) that at 
some instant are close to Py remain close to Pp at all future times; precisely: if for 
every disk D, of radius €« > 0 with center Po there is a disk Ds of radius 6 > 0 with 
center Po such that every trajectory of (1) that has a point P, (corresponding to t = 14, 
say) in Dg has all its points corresponding to t = fy in D,. See Fig. 90. 

Po is called unstable if Po is not stable. 

Po is called stable and attractive (or asymptotically stable) if Py is stable and 
every trajectory that has a point in Ds approaches Po as t— ~. See Fig. 91. 


Classification criteria for critical points in terms of stability are given in Table 4.2. Both 
tables are summarized in the stability chart in Fig. 92. In this chart region of instability 
is dark blue. 


In the sense of the Russian mathematician ALEXANDER MICHAILOVICH LJAPUNOV (1857-1918), 
whose work was fundamental in stability theory for ODEs. This is perhaps the most appropriate definition of 
stability (and the only we shall use), but there are others, too. 
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EXAMPLE 1 
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Fig. 90. Stable critical point P, of (1) Fig. 91. Stable and attractive critical 
(The trajectory initiating at P, stays point Py of (1) 
in the disk of radius e) 


Table 4.2 Stability Criteria for Critical Points 


Type of Stability p= tars | q = Mars 
(a) Stable and attractive p<0 q>0 
(b) Stable ps0 q>0 
(c) Unstable p>0o OR q<0 

q 


A>0 ‘Ge 


» 


Fig. 92. Stability chart of the system (1) with p, q, A defined in (5). 
Stable and attractive: The second quadrant without the q-axis. 
Stability also on the positive q-axis (which corresponds to centers). 
Unstable: Dark blue region 


We indicate how the criteria in Tables 4.1 and 4.2 are obtained. If g = AyAz > 0, both 
of the eigenvalues are positive or both are negative or complex conjugates. If also 
Pp = Ay + Ag < 0, both are negative or have a negative real part. Hence Pp is stable and 
attractive. The reasoning for the other two lines in Table 4.2 is similar. 

If A < 0, the eigenvalues are complex conjugates, say, Ay = a + iBand Ag = a — if. 
If also p = Ay + Ag = 2a < 0, this gives a spiral point that is stable and attractive. If 
p = 2a > 0, this gives an unstable spiral point. 

If p = 0, then Ag = —Aq and g = AyAg = —AZ. If also q > 0, then AZ = —q < 0, so 
that Aj, and thus Ag, must be pure imaginary. This gives periodic solutions, their trajectories 
being closed curves around Po, which is a center. 


Application of the Criteria in Tables 4.1 and 4.2 


3 1 

In Example 1, Sec 4.3, we have y’ = | y, p = —6,q = 8, A = 4, a node by Table 4.1(a), which is 
1 -3 

stable and attractive by Table 4.2(a). | 
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EXAMPLE 2 


Free Motions of a Mass on a Spring 
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What kind of critical point does my” + cy’ + ky = 0 in Sec. 2.4 have? 


Solution. Division by m gives y" = 


Then yz = y" = —(k/m)y1 — (c/m)yz. Hence 


: 0 1 
y= 
—k/m  —c/m 


We see that p 


| y, det (A — AT) = 


(c/m)* 


c/m, q = k/m, A 


—(k/m)y — (c/m)y'. To get a system, set yy = y, yo = y’ (see Sec. 4.1). 


—A 1 5 Cc k 
=\+—A+—=0. 
m m 


—k/m —c/m— xX 


4k/m. From this and Tables 4.1 and 4.2 we obtain the following 


results. Note that in the last three cases the discriminant A plays an essential role. 


No damping. c = 0, p = 0, g > 0, a center. 


Underdamping. c2 < Amk, p <0,q > 0, A <0, a stable and attractive spiral point. 
Critical damping. C= 4mk, p < 0,q > 0, A = 0, a stable and attractive node. 
Overdamping. C> 4mk, p < 0,q > 0, A > 0, a stable and attractive node. i] 


PROBLEM SET 4-4 


1-10 


TYPE AND STABILITY OF 


CRITICAL POINT 


Determine the type and stability of the critical point. Then 
find a real general solution and sketch or graph some of the 
trajectories in the phase plane. Show the details of your work. 


1. yy = yy 2. yj = —4y1 
ya = 2ye ya = —3y2 
3. 2 4. yi = 2y1 + Ya 
yo = —9y1 yo = 5y1 — 2y2 
5. yi = —2y1 + 2ye 6. yi = —6y1 — yo 
ya = —2y1 — 2ye ya = —9y1 — 6y2 
7. yt = yi + 2ye 8. yi = —y1 + 4y2 
, e 
yo = 2y1 + yo yo = 3y1 — 2ye2 
9. Y = 4y, + yo 10. Y1 = yo 
yo = 4y1 + 4y2 yo = —5y1 — 2y2 
11-18 TRAJECTORIES OF SYSTEMS AND 
SECOND-ORDER ODEs. CRITICAL 
POINTS 
11. Damped oscillations. Solve y” + 2y’ + 2y = 0. What 


12. 


13. 


14. 


kind of curves are the trajectories? 


Harmonic oscillations. Solve y” + 5y = 0. Find the 
trajectories. Sketch or graph some of them. 


Types of critical points. Discuss the critical points in 
(10)-(13) of Sec. 4.3 by using Tables 4.1 and 4.2. 


Transformation of parameter. What happens to the 
critical point in Example 1 if you introduce tT = —t as 
a new independent variable? 


15. 


16. 


17. 


18. 


19. 


20. 


Perturbation of center. What happens in Example 4 
of Sec. 4.3 if you change A to A + 0.11, where I is the 
unit matrix? 


Perturbation of center. If a system has a center as 
its critical point, what happens if you replace the 
matrix A by A = A + Kl with any real number k # 0 
(representing measurement errors in the diagonal 
entries)? 


Perturbation. The system in Example 4 in Sec. 4.3 
has a center as its critical point. Replace each aj, in 
Example 4, Sec. 4.3, by aj, + b. Find values of b such 
that you get (a) a saddle point, (b) a stable and attractive 
node, (c) a stable and attractive spiral, (d) an unstable 
spiral, (e) an unstable node. 


CAS EXPERIMENT. Phase Portraits. Graph phase 
portraits for the systems in Prob. 17 with the values 
of b suggested in the answer. Try to illustrate how 
the phase portrait changes “continuously” under a 
continuous change of b. 


WRITING PROBLEM. Stability. Stability concepts 
are basic in physics and engineering. Write a two-part 
report of 3 pages each (A) on general applications 
in which stability plays a role (be as precise as you 
can), and (B) on material related to stability in this 
section. Use your own formulations and examples; do 
not copy. 


Stability chart. Locate the critical points of the 
systems (10)-(14) in Sec. 4.3 and of Probs. 1, 3, 5 in 
this problem set on the stability chart. 
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4.5 Qualitative Methods for Nonlinear Systems 


Qualitative methods are methods of obtaining qualitative information on solutions 
without actually solving a system. These methods are particularly valuable for systems 
whose solution by analytic methods is difficult or impossible. This is the case for many 
practically important nonlinear systems 


; yt =fiO1 ye) 
(1) y = fy), thus 
yo = fo(y1, ya). 


In this section we extend phase plane methods, as just discussed, from linear systems 
to nonlinear systems (1). We assume that (1) is autonomous, that is, the independent 
variable t does not occur explicitly. (All examples in the last section are autonomous.) 
We shall again exhibit entire families of solutions. This is an advantage over numeric 
methods, which give only one (approximate) solution at a time. 

Concepts needed from the last section are the phase plane (the y, yo-plane), trajectories 
(solution curves of (1) in the phase plane), the phase portrait of (1) (the totality of these 
trajectories), and critical points of (1) (points (1, yg) at which both f{(y1, ya) and fo(1, ye) 
are zero). 

Now (1) may have several critical points. Our approach shall be to discuss one critical 
point after another. If a critical point Po is not at the origin, then, for technical 
convenience, we shall move this point to the origin before analyzing the point. More 
formally, if Po: (a, b) is a critical point with (a, b) not at the origin (0, 0), then we apply 
the translation 


Jy =)1 — 4, Yo=yo—b 


which moves Pp to (0, 0) as desired. Thus we can assume Py to be the origin (0, 0), and 
for simplicity we continue to write y1, ye (instead of V1, V2). We also assume that Pp is 
isolated, that is, it is the only critical point of (1) within a (sufficiently small) disk with 
center at the origin. If (1) has only finitely many critical points, that is automatically 
true. (Explain!) 


Linearization of Nonlinear Systems 


How can we determine the kind and stability property of a critical point Pp: (0, 0) of 
(1)? In most cases this can be done by linearization of (1) near Po, writing (1) as 
y = f(y) = Ay + h(y) and dropping h(y), as follows. 

Since Py is critical, f,(0, 0) = 0, fo(0, 0) = 0, so that f; and fo have no constant terms 
and we can write 


; Yt = 41 y1 + Ayay2 + hay Ya) 
(2) y =Ay+ hj), thus 


yo = dg1y1 + degye + ho(y1, ya). 


A is constant (independent of f) since (1) is autonomous. One can prove the following 
(proof in Ref. [A7], pp. 375-388, listed in App. 1). 
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THEOREM -1 


EXAMPLE=-1 


Linearization 


Tf f, and fz in (1) are continuous and have continuous partial derivatives in a 
neighborhood of the critical point Po: (0, 0), and if det A # 0 in (2), then the kind 
and stability of the critical point of (1) are the same as those of the linearized 
system 


(3) zee i Y1 = ay1yi + aey2 
y = Ay, thus 


i 
Y2 = 42191 + do2o. 


Exceptions occur if A has equal or pure imaginary eigenvalues; then (1) may have 
the same kind of critical point as (3) or a spiral point. 


Free Undamped Pendulum. Linearization 


Figure 93a shows a pendulum consisting of a body of mass m (the bob) and a rod of length L. Determine the 
locations and types of the critical points. Assume that the mass of the rod and air resistance are negligible. 


Solution. Step 1. Setting up the mathematical model. Let 6 denote the angular displacement, measured 
counterclockwise from the equilibrium position. The weight of the bob is mg (g the acceleration of gravity). It 
causes a restoring force mg sin @ tangent to the curve of motion (circular arc) of the bob. By Newton’s second 
law, at each instant this force is balanced by the force of acceleration mL6", where LQ” is the acceleration; 
hence the resultant of these two forces is zero, and we obtain as the mathematical model 

mL@” + mg sin @ = 0. 
Dividing this by mL, we have 


&§ 
(4) 6 kane 0 («=*), 


When @ is very small, we can approximate sin 6 rather accurately by @ and obtain as an approximate solution 
Acos Vkt + B sin Vkt, but the exact solution for any @ is not an elementary function. 


Step 2. Critical points (0, 0), (+27, 0), (+47, 0),---, Linearization. To obtain a system of ODEs, we set 
6 = y1, 0’ = yo. Then from (4) we obtain a nonlinear system (1) of the form 


yt = Ain ya) = Ye 
(4*) —_ . 
yo = fo(y1, y2) = —k sin yy. 


The right sides are both zero when yz = 0 and sin y; = 0. This gives infinitely many critical points (n77, 0), 
where n = 0, +1, +2, ---. We consider (0, 0). Since the Maclaurin series is 


sin yy = yy — yi See Ee Vi, 
the linearized system at (0, 0) is 


yi = y2 


ye = ~ky1. 


To apply our criteria in Sec. 4.4 we calculate p = ay, + dog = 0,q = detA =k = g/L(>0), and 
A= p — 4q = —4k. From this and Table 4.1(c) in Sec. 4.4 we conclude that (0, 0) is a center, which is always 
stable. Since sin 9 = sin yj is periodic with period 277, the critical points (n7r, 0), n = +2, +4,---, are all centers. 


Step 3. Critical points (+7, 0), (+37, 0), (+57, 0),---, Linearization. We now consider the critical point 
(ar, 0), setting 9 — 7 = y, and (@ — 7)’ = 0’ = yo. Then in (4), 


sin 0 = sin (y, + 77) sin yy yp t by? — te = yy 
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and the linearized system at (77, 0) is now 


0 ] 1 = Ye 
y' =Ay= y, thus 
k 0 ye = ky. 
We see that p = 0, g = —k (<0), and A = —4q = 4k. Hence, by Table 4.1(b), this gives a saddle point, which 
is always unstable. Because of periodicity, the critical points (n7r, 0),n = +1, +3,---, are all saddle points. 
These results agree with the impression we get from Fig. 93b. a 


mg sin @ 


mg 
(a) Pendulum (b) Solution curves yp(9,) of (4) in the phase plane 
Fig. 93. Example 1(C will be explained in Example 4.) 


Linearization of the Damped Pendulum Equation 


To gain further experience in investigating critical points, as another practically important case, let us see how 
Example 1 changes when we add a damping term c6’ (damping proportional to the angular velocity) to equation 
(4), so that it becomes 


(5) 6” + co’ + ksind =0 


where k > 0 and c = 0 (which includes our previous case of no damping, c = 0). Setting @ = y,, 0’ = yo, as 
before, we obtain the nonlinear system (use 0” = ys) 


t 
vi. Ye 
yg = —ksin yy — cya. 
We see that the critical points have the same locations as before, namely, (0, 0), (+77, 0), (+277, 0),---. We 


consider (0, 0). Linearizing sin y; ~ y, as in Example 1, we get the linearized system at (0, 0) 


0 1 yi = yo 


(6) y’ = Ay= 


y, thus 
-k -c 


y2 = —ky1 — cya. 
This is identical with the system in Example 2 of Sec. 4.4, except for the (positive!) factor m (and except for 
the physical meaning of y;). Hence for c = 0 (no damping) we have a center (see Fig. 93b), for small damping 
we have a spiral point (see Fig. 94), and so on. 

We now consider the critical point (77, 0). We set @ — 7 = yy, (0 — 7)’ = 6’ = yo and linearize 


sin 9 = sin (y, + 77) sin yy ~ —yy. 


This gives the new linearized system at (77, 0) 


(6*) y =Ay= 
k =e 
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For our criteria in Sec. 4.4 we calculate p = ay, + dog c,qg = detA k, and A p 4q c2 + 4k. 
This gives the following results for the critical point at (77, 0). 


No damping. c = 0, p = 0,¢ < 0, A > 0, a saddle point. See Fig. 93b. 
Damping. c > 0, p < 0,q¢ < 0, A > 0, a saddle point. See Fig. 94. 


Since sin y; is periodic with period 277, the critical points (+277, 0), (+477, 0),--- are of the same type as 
(0, 0), and the critical points (—77, 0), (+377, 0),--- are of the same type as (77, 0), so that our task is finished. 

Figure 94 shows the trajectories in the case of damping. What we see agrees with our physical intuition. 
Indeed, damping means loss of energy. Hence instead of the closed trajectories of periodic solutions in 
Fig. 93b we now have trajectories spiraling around one of the critical points (0, 0), (+277, 0),---. Even the 
wavy trajectories corresponding to whirly motions eventually spiral around one of these points. Furthermore, 
there are no more trajectories that connect critical points (as there were in the undamped case for the saddle 
points). a 


Y2 


Fig. 94. Trajectories in the phase plane for the damped pendulum in Example 2 


Lotka—Volterra Population Model 


EXAMPLE 3 _ Predator—Prey Population Model? 
This model concerns two species, say, rabbits and foxes, and the foxes prey on the rabbits. 


Step 1. Setting up the model. We assume the following. 


1. Rabbits have unlimited food supply. Hence, if there were no foxes, their number y,(‘) would grow 
exponentially, yj = ayy. 


2. Actually, y; is decreased because of the kill by foxes, say, at a rate proportional to y; yo, where yo(t) is 
the number of foxes. Hence yj = ay; — by ya, where a > 0 and b > 0. 


3. If there were no rabbits, then yo(t) would exponentially decrease to zero, yy = —ly2. However, yo is 
increased by a rate proportional to the number of encounters between predator and prey; together we 
have yg = —lyg + ky, ye, where k > 0 and / > 0. 


This gives the (nonlinear!) Lotka—Volterra system 


yt = AQ ya) = ay1 — by rye 
7) I ~ 
yo = fo(y1, y2) = kyiy2 — We. 


Introduced by ALFRED J. LOTKA (1880-1949), American biophysicist, and VITO VOLTERRA 
(1860-1940), Italian mathematician, the initiator of functional analysis (see [GR7] in App. 1). 
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Step 2. Critical point (0, 0), Linearization. We see from (7) that the critical points are the solutions of 


(7*) AiQ1 Ya) = yila — byz) = 0, fay, Ya) = yalky1 — 2 = 0. 
i 
The solutions are (yj, yo) = (0, 0) and (Z 2). We consider (0, 0). Dropping —byy ye and ky; yg from (7) gives 


the linearized system 


a 0 
y. 
0 -l 


Its eigenvalues are Ay = a > 0 and Ag = —/ < 0. They have opposite signs, so that we get a saddle point. 


Step. 3. Critical point (I/k, a/b), Linearization. We set y, = ¥; + I/k, y2 = Yo + a/b. Then the critical point 
(i/k, a/b) corresponds to (1, ¥2) = (0, 0). Since ¥, = yj, ¥2 = yo, we obtain from (7) [factorized as in (7*)] 


ap os 1\[ S a\] St = 
a (5 | ak b(i | *)| (5 i) Py) 


1 de = 
Gu!) t]-(u+8)m 


Dropping the two nonlinear terms —by yz and ky V2, we have the linearized system 


- Ib. 
(a) yi= “4 
(7) 
_, ake 
) =o 


The left side of (a) times the right side of (b) must equal the right side of (a) times the left side of (b), 


eee io Aetuidont ak 5 4s Ib 5 ‘ 
—— yi = Soyo. integration, a por = const. 
pot ee Ni g p> kee 


This is a family of ellipses, so that the critical point (J/k, a/b) of the linearized system (7**) is a center (Fig. 95). 
It can be shown, by a complicated analysis, that the nonlinear system (7) also has a center (rather than a spiral 
point) at (J/k, a/b) surrounded by closed trajectories (not ellipses). 

We see that the predators and prey have a cyclic variation about the critical point. Let us move counterclockwise 
around the ellipse, beginning at the right vertex, where the rabbits have a maximum number. Foxes are sharply 
increasing in number until they reach a maximum at the upper vertex, and the number of rabbits is then sharply 
decreasing until it reaches a minimum at the left vertex, and so on. Cyclic variations of this kind have 
been observed in nature, for example, for lynx and snowshoe hare near the Hudson Bay, with a cycle of about 
10 years. 

For models of more complicated situations and a systematic discussion, see C. W. Clark, Mathematical 


Bioeconomics: The Mathematics of Conservation, 3rd ed. Hoboken, NJ, Wiley, 2010. a 
V2 
a 
b 
| 
| 
| 
L nal 
k 


Fig. 95. Ecological equilibrium point and trajectory 
of the linearized Lotka—Volterra system (7**) 
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EXAMPLE 4 


Transformation to a First-Order Equation 
in the Phase Plane 


Another phase plane method is based on the idea of transforming a second-order 
autonomous ODE (an ODE in which ft does not occur explicitly) 


Fly, y',y") =0 


to first order by taking y = y as the independent variable, setting y’ = yg and transforming 
y” by the chain rule, 


Then the ODE becomes of first order, 


dys 
(8) P(n <2) = 
Y1 


and can sometimes be solved or treated by direction fields. We illustrate this for the 
equation in Example | and shall gain much more insight into the behavior of solutions. 


An ODE (8) for the Free Undamped Pendulum 


If in (4) 0” + ksin @ = 0 we set 0 = yy, 0’ = yo (the angular velocity) and use 


ar dyn dyg dy, dy dyo, a 
a we get yg = —K sin yy. 
dt in ii v2. g ae, 2 J1 
Separation of variables gives yo dyg = —k sin y, dy,. By integration, 
9) 33 = kcosyy + C (C constant). 


Multiplying this by mL”, we get 
4 m(Lyo)” — mL?k cos y= mLC. 


We see that these three terms are energies. Indeed, yz is the angular velocity, so that Lyg is the velocity and the 
first term is the kinetic energy. The second term (including the minus sign) is the potential energy of the pendulum, 
and mL?C is its total energy, which is constant, as expected from the law of conservation of energy, because 
there is no damping (no loss of energy). The type of motion depends on the total energy, hence on C, as follows. 

Figure 93b shows trajectories for various values of C. These graphs continue periodically with period 277 to 
the left and to the right. We see that some of them are ellipse-like and closed, others are wavy, and there are two 
trajectories (passing through the saddle points (n7r,0),n = +1, +3,---) that separate those two types of 
trajectories. From (9) we see that the smallest possible C is C = —k; then yg = 0, and cos y; = 1, so that the 
pendulum is at rest. The pendulum will change its direction of motion if there are points at which yp = 6’ = 0. 
Then kcos yy + C = 0 by (9). If yy = 7, then cos yy = —1 and C =k. Hence if —k << C<k, then the 
pendulum reverses its direction for a lyr = |0| < 7, and for these values of C with |C| < k the pendulum 
oscillates. This corresponds to the closed trajectories in the figure. However, if C > k, then yp = 0 is impossible 
and the pendulum makes a whirly motion that appears as a wavy trajectory in the y; yo-plane. Finally, the value 
C = k corresponds to the two “separating trajectories” in Fig. 93b connecting the saddle points. | 


The phase plane method of deriving a single first-order equation (8) may be of practical 
interest not only when (8) can be solved (as in Example 4) but also when a solution 
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is not possible and we have to utilize fields (Sec. 1.2). We illustrate this with a very 
famous example: 


Self-Sustained Oscillations. Van der Pol Equation 


There are physical systems such that for small oscillations, energy is fed into the system, whereas for large 
oscillations, energy is taken from the system. In other words, large oscillations will be damped, whereas for 
small oscillations there is “negative damping” (feeding of energy into the system). For physical reasons we 
expect such a system to approach a periodic behavior, which will thus appear as a closed trajectory in the phase 
plane, called a limit cycle. A differential equation describing such vibrations is the famous van der Pol equation* 


(10) y” — pd - yyy! +y=0 (ww > O, constant). 


It first occurred in the study of electrical circuits containing vacuum tubes. For wz = 0 this equation becomes 
y” + y = 0 and we obtain harmonic oscillations. Let u > 0. The damping term has the factor —y(1 — y?). 
This is negative for small oscillations, when y? <1, so that we have “negative damping,” is zero for y = 
(no damping), and is positive if y? > 1 (positive damping, loss of energy). If yz is small, we expect a limit cycle 
that is almost a circle because then our equation differs but little from y” + y = 0. If w is large, the limit cycle 
will probably look different. 

Setting y = y,, y’ = yo and using y” = (dy2/dyy)y2 as in (8), we have from (10) 


dya 


(nD) ye — WL — yidy, + 91 = 0. 
dy, 


The isoclines in the y,yg-plane (the phase plane) are the curves dy/dy, = K = const, that is, 


dy2 9 JI 
Ty K. 
dy, wl ~ ya) y2 


Solving algebraically for yg, we see that the isoclines are given by 


J1 
o=—— (Figs. 96, 97). 
w(L — yt) — K 
J2 
K=-} K=0 K=-1 


Fig. 96. Direction field for the van der Pol equation with ~ = 0.1 in the phase plane, 
showing also the limit cycle and two trajectories. See also Fig. 8 in Sec. 1.2 


4B ALTHASAR VAN DER POL (1889-1959), Dutch physicist and engineer. 
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Figure 96 shows some isoclines when x is small, 4 = 0.1, the limit cycle (almost a circle), and two (blue) trajectories 
approaching it, one from the outside and the other from the inside, of which only the initial portion, a small spiral, is 
shown. Due to this approach by trajectories, a limit cycle differs conceptually from a closed curve (a trajectory) 
surrounding a center, which is not approached by trajectories. For larger yz the limit cycle no longer resembles a 
circle, and the trajectories approach it more rapidly than for smaller yw. Figure 97 illustrates this for = 1. 


Fig. 97. Direction field for the van der Pol equation with jz = 1 in the phase plane, 
showing also the limit cycle and two trajectories approaching it 


PROBLEM SET 475 


1. 


Pendulum. To what state (position, speed, direction 4-8| CRITICAL POINTS. LINEARIZATION 
of motion) do the four points of intersection of a 
closed trajectory with the axes in Fig. 93b 
correspond? The point of intersection of a wavy curve 


Find the location and type of all critical points by 
linearization. Show the details of your work. 


Pe ag oo 
with the yg-axis? ss v1 ot Hi v1 Y2 — 
y2 = y2 y2 = “Yi © By1 

Limit cycle. What is the essential difference between 6. y1 = yo 7. yi = yi +yo- ye 
a limit cycle and a closed trajectory surrounding a ys = -y1 — y? yy = -y1 — yo 
center? ro 2 

8. y1 = Ya — ¥3 
CAS EXPERIMENT. Deformation of Limit Cycle. Jo> Yu YI 


Convert the van der Pol equation to a system. Graph 9-13] CRITICAL POINTS OF ODEs 
the limit cycle and some approaching trajectories for 


be = 0.2, 0.4, 0.6, 0.8, 1.0, 1.5, 2.0. Try to observe how 
the limit cycle changes its form continuously if you 
< : 5 ree ” 3 ” 3 
vary 2 continuously. Describe in words how the limit 9 y —9y + y" =0 10.y +y-y"=0 
cycle is deformed with growing p. 11. y” + cosy =0 12. y" + 9y + y? =0 


Find the location and type of all critical points by first 
converting the ODE to a system and then linearizing it. 
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14. 
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y" + siny = 0 

TEAM PROJECT.  Self-sustained oscillations. 
(a) Van der Pol equation. Determine the type of the 
critical point at (0,0) when pw > 0, uw = 0, < 0. 
(b) Rayleigh equation. Show that the Rayleigh 
equation? 

y” — wl — 4Y’)¥' +Y¥Y=0 (u>0) 

also describes self-sustained oscillations and that by 


differentiating it and setting y = Y’ one obtains the van 
der Pol equation. 


(c) Duffing equation. The Duffing equation is 
y" + woy + By? =0 

where usually || is small, thus characterizing a small 
deviation of the restoring force from linearity. B > 0 
and B < 0 are called the cases of a hard spring and a 
soft spring, respectively. Find the equation of the 
trajectories in the phase plane. (Note that for B > 0 all 
these curves are closed.) 


15. Trajectories. Write the ODE y” — 4y + y? =Oasa 


system, solve it for yg as a function of yy, and sketch 
or graph some of the trajectories in the phase plane. 


Yo 


Fig. 98. Trajectories in Problem 15 


4.6 Nonhomogeneous Linear Systems of ODEs 


In this section, the last one of Chap. 4, we discuss methods for solving nonhomogeneous 
linear systems of ODEs 


(1) y'=Ay+g (see Sec. 4.2) 


where the vector g(t) is not identically zero. We assume g(f) and the entries of the 
n Xn matrix A(t) to be continuous on some interval J of the ft-axis. From a general 
solution yo) of the homogeneous system y’ = Ay on J and a particular solution 
y(t) of (1) on J [i.e., a solution of (1) containing no arbitrary constants], we get a 
solution of (1), 


(2) Va ese 


y is called a general solution of (1) on J because it includes every solution of (1) on J. 
This follows from Theorem 2 in Sec. 4.2 (see Prob. | of this section). 

Having studied homogeneous linear systems in Secs. 4.1-4.4, our present task will be 
to explain methods for obtaining particular solutions of (1). We discuss the method of 


5LORD RAYLEIGH (JOHN WILLIAM STRUTT) (1842-1919), English physicist and mathematician, 
professor at Cambridge and London, known by his important contributions to the theory of waves, elasticity 
theory, hydrodynamics, and various other branches of applied mathematics and theoretical physics. In 1904 he 
was awarded the Nobel Prize in physics. 
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EXAMPLE-1 


undetermined coefficients and the method of the variation of parameters; these have 
counterparts for a single ODE, as we know from Secs. 2.7 and 2.10. 


Method of Undetermined Coefficients 


Just as for a single ODE, this method is suitable if the entries of A are constants and 
the components of g are constants, positive integer powers of t, exponential functions, 
or cosines and sines. In such a case a particular solution y? is assumed in a form similar 
to g; for instance, y” =u+vw+ wi? if g has components quadratic in f, with u, v, 
w to be determined by substitution into (1). This is similar to Sec. 2.7, except for the 


Modification Rule. It suffices to show this by an example. 


Method of Undetermined Coefficients. Modification Rule 


Find a general solution of 


3 1 


| -6 
yar et 


Solution. A general equation of the homogeneous system is (see Example | in Sec. 4.3) 


(3) yo ayes] 
1 -3 


1 
—4t 


1 
(4) y = “| | eo 
1 


Since A = —2 is an eigenvalue of A, the function e 


the Modification Rule by setting 


* on the right side also appears in y®, and we must apply 


(p) 2t 


y= ute~7" + ye7 ae. 


(rather than ue 
Note that the first of these two terms is the analog of the modification in Sec. 2.7, but it would not be sufficient 
here. (Try it.) By substitution, 


(p)! —2t 2t 


ue 2ute™ 


y Qve~** = Aute** + Ave?" 4 g. 


Equating the te~*"terms on both sides, we have —2u = Au. Hence u is an eigenvector of A corresponding to 
A = —2; thus [see (5)] u = a[l yy with any a # 0. Equating the other terms gives 


—6 a 2u, —3vu, + ve 
thus a = 
2 a 2v2 vy — 3v2 


Collecting terms and reshuffling gives 
vy — Vg = -a-6 


u— 2v=Avt+ + 


—v, + vg = -a +2. 


By addition, 0 = —2a — 4,a = —2, and then vg = vy + 4, say, vy = k, vg = k + 4, thus, v = [k k+ ay". 
We can simply choose k = 0. This gives the answer 
1 1 
e #2 te + 
=I 1 


For other k we get other v; for instance, k = —2 gives v = [—2 ay", so that the answer becomes 


1 1 
et 2 te 4 
—-1 1 


—2t 
4 


1 
(5) y=yP+tyP=q | | et + op 
1 


1 


1 =2 
(5*) yrad | | en + oy | et etc. | 
2 
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Method of Variation of Parameters 


This method can be applied to nonhomogeneous linear systems 


(6) y =A(y + 2) 


with variable A = A(f) and general g(f). It yields a particular solution y? of (6) on some 
open interval J on the t-axis if a general solution of the homogeneous system y’ = A(d)y 
on J is known. We explain the method in terms of the previous example. 


Solution by the Method of Variation of Parameters 


Solve (3) in Example 1. 


Solution. A basis of solutions of the homogeneous system is [e 2! Pam ie and le *# =e AH! Hence 
the general solution (4) of the homogeneous system may be written 
’ ent ele 
(7) ye S| x a = Y(f)e. 
e —e C2, 


Here, Y(t) = Ly? yor is the fundamental matrix (see Sec. 4.2). As in Sec. 2.10 we replace the constant 
vector ¢ by a variable vector u(t) to obtain a particular solution 


y” = Y(‘)u(s). 
Substitution into (3) y’ = Ay + g gives 
(8) Y’u + Yu’ = AYu+ g. 


Now since y? and yo are solutions of the homogeneous system, we have 


yr! = Ay, yo” = Ay”, thus y’ = AY. 
Hence Y’u = AYu, so that (8) reduces to 
Yu’ =g. The solution is u’ = Y-1g: 


here we use that the inverse Y~* of Y (Sec. 4.0) exists because the determinant of Y is the Wronskian W, which 
is not zero for a basis. Equation (9) in Sec. 4.0 gives the form of Y~+, 


— i et et —~6e72¢] “a 4 7 a) 
mS a) eat _ pt Qe~2t 9 _gp2t ~ —4e2t |" 


Integration is done componentwise (just as differentiation) an 


(where + 2 comes from the lower limit of integration). From 


=2t 


—2te~?* — 2e-?! + 20H 


d gives 


—2t 


| —2e* + 2 


this and Y in (7) we obtain 


eT 2t ett 
ew 2t ett 


2e™* +4 , | 2te~** + 20?" 
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The last term on the right is a solution of the homogeneous system. Hence we can absorb it into y®. We thus 
obtain as a general solution of the system (3), in agreement with (5*). 


1 
(9) =a Jem 


1 


1. Prove that (2) includes every solution of (1). 


2-7| GENERAL SOLUTION 


Find a general solution. Show the details of your work. 


2. yy =¥1 + yo + 10 cost 
yg = 3y1 — yo - 10 sin t 


4. yy = 4y1 — 8yq + 2 cosht 
cosh t + 2 sinh t 
5. yy = 4y1 + yo + 0.68 

ya = 2y1 + 3y2 — 2.5¢ 


7. yy = —3y, — 4ye + Ie + 15 
y2 = Sy, + 6y2 + 3e7* — 15¢ — 20 

8. CAS EXPERIMENT. Undetermined Coefficients. 
Find out experimentally how general you must choose 
y”, in particular when the components of g have 
a different form (e.g., as in Prob. 7). Write a short 
report, covering also the situation in the case of the 
modification rule. 


9. Undetermined Coefficients. Explain why, in Example 
1 of the text, we have some freedom in choosing the 
vector v. 


10-15; INITIAL VALUE PROBLEM 
Solve, showing details: 
10. yy = —3y, — 4y2 + See 

yz = 5y1 + 6y2 — be" 

y1(0) = 19, y2(0) = —23 


11. yt, = yo + 6e7 
Pe 2t 
JAM Ve. 
y10) = 1, ya(0) = 
12. yj = y1 + 4y2 - 12 + 6f 
yg=yrtye-t?+t-1 


y1(0) = 2, ya(0) = —1 
13. yj = yo — Ssint 
yo = —4y, + 17 cost 


y1(0) = 5, ye(0) = 2 
14. y, = 4yo + 5eé 

ya = —y1 — 20e~* 

y1(0) = 1, ye(0) = 0 


1 1 
et 2 te 2 4 
-1 1 


=) 
et | 
2 


PROBLEEM—SET 4-6 


15. y, = y1 + 2yq + e%* — 24 
ya=—yat1+t 
yO) = 1, yo(0) = —4 

16. WRITING PROJECT. Undetermined Coefficients. 
Write a short report in which you compare the 
application of the method of undetermined coefficients 
to a single ODE and to a system of ODEs, using ODEs 
and systems of your choice. 


NETWORK 


Find the currents in Fig. 99 (Probs. 17-19) and Fig. 100 
(Prob. 20) for the following data, showing the details of 
your work. 


17. Ry = 20, Re 


80,L = 1H,C=0.5F,E = 200 V 


18. Solve Prob. 17 with E = 440 sin ¢ V and the other data 
as before. 


19. In Prob. 17 find the particular solution when currents 
and charge at t = 0 are zero. 


Switch C 


Fig. 99. Problems 17-19 


20. Ry = 10, Ro = 140, Ly = 08H, Ly = 1H, 
E = 100 V, (0) = 1,(0) = 0 


Problem 20 


Fig. 100. 
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CHAP TER—4-REVIEW-QUESTIONS AND PROBLEMS 


1. State some applications that can be modeled by systems 
of ODEs. 

2. What is population dynamics? Give examples. 

3. How can you transform an ODE into a system of ODEs? 

4. What are qualitative methods for systems? Why are they 
important? 

5. What is the phase plane? The phase plane method? A 
trajectory? The phase portrait of a system of ODEs? 


6. What are critical points of a system of ODEs? How did 
we classify them? Why are they important? 


7. What are eigenvalues? What role did they play in this 
chapter? 

8. What does stability mean in general? In connection with 
critical points? Why is stability important in engineering? 

9. What does linearization of a system mean? 


10. Review the pendulum equations and their linearizations. 


11-17} GENERAL SOLUTION. CRITICAL POINTS 


Find a general solution. Determine the kind and stability of 
the critical point. 


11. yj, = 2ye 12. yj, = 5y1 
y2 = 8y1 y2 = ya 
13. yj = —2y, + Sye 14. yy = 3y, + 4y2 
y2 = —y1 — Oye y2 = 3y1 + 2ye 
15. y) = —3y1 — 2ye 16. y, = 4y2 
y2 = —2y1 — 3ya ya = —4y1 
17. yy = —y1 + 2y2 
y2 = —2y1 — yo 


18-19 | CRITICAL POINT 
What kind of critical point does y’ = Ay have if A has the 
eigenvalues 


18. —4 and 2 19, 2 + 31,2 — 3i 


20-23 | NONHOMOGENEOUS SYSTEMS 
Find a general solution. Show the details of your work. 
20. yy = 29; + 2yg +e? 

Jo = —2y1 — 3ya + 
21. yt = 4yo 
ya = 4y1 + 321? 


22. yy =y, + yo + sint 
y2 = 4y1 + Yo 

23. yj = y1 + 4yq — 2 cost 
yg = y1 + yo — cost + sint 


24. Mixing problem. Tank 7; in Fig. 101 initially contains 
200 gal of water in which 160 lb of salt are dissolved. 
Tank 7; initially contains 100 gal of pure water. Liquid 
is pumped through the system as indicated, and the 
mixtures are kept uniform by stirring. Find the amounts 
of salt y1(f) and ya(t) in 7%, and 7, respectively. 


Water, 
10 gal/min 


Mixture, 
10 gal/min 


16 gal/min 


Tanks in Problem 24 


Fig. 101. 


25. Network. Find the currents in Fig. 102 when 
R=250,L=1H, C = 0.04F, E(f) = 169 sint V, 
1,(0) = 0, [g(0) = 0. 


Fig. 102. Network in Problem 25 


26. Network. Find the currents in Fig. 103 when R = 1 Q, 
L= 1.25H, C= 0.2 F, (0) = 1A, (0) = 1A. 


Network in Problem 26 


Fig. 103. 


27-30 | LINEARIZATION 


Find the location and kind of all critical points of the given 
nonlinear system by linearization. 


27. yi = ye 28. y1 = cos yo 
ya=y1- i ya = 3y1 

29. yy = —4yo 30. yy = 2y2 + yh 
ya = sin yy y2 = —8y1 


Summary of Chapter 4 
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SUMMARY-OF-CHAPTER-4 


Systems of ODEs. Phase Plane. Qualitative Methods 


Whereas single electric circuits or single mass—spring systems are modeled by 
single ODEs (Chap. 2), networks of several circuits, systems of several masses 
and springs, and other engineering problems lead to systems of ODEs, involving 
several unknown functions y (f),---,y,(f). Of central interest are first-order 
systems (Sec. 4.2): 


y4 = fil, y1.°°*5 Yn) 


y =f(ty), in components, 
Yn a Slt Yi 's Yn)» 


to which higher order ODEs and systems of ODEs can be reduced (Sec. 4.1). In 
this summary we let n = 2, so that 


; yi = Alt y1, Ya) 
(1) y =f(ty), in components, 


ya = falt, y1, ye). 


Then we can represent solution curves as trajectories in the phase plane (the 
y1ye-plane), investigate their totality [the “phase portrait’ of (1)], and study the kind 
and stability of the critical points (points at which both f; and fg are zero), and 
classify them as nodes, saddle points, centers, or spiral points (Secs. 4.3, 4.4). These 
phase plane methods are qualitative; with their use we can discover various general 
properties of solutions without actually solving the system. They are primarily used 
for autonomous systems, that is, systems in which ¢ does not occur explicitly. 


A linear system is of the form 
a1 42 Bal §1 
é] y = oy g = . 
d21 422 y2 §2 


If g = 0, the system is called homogeneous and is of the form 


(2) y’=Ay+g, where A= 


(3) y’ = Ay. 


If a41,°**, dag are constants, it has solutions y = xe**, where A is a solution of the 
quadratic equation 


ay—A ay2 
= (441 — A)(a22 — A) — ayga21 = O 


a21 a22, 
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and x # 0 has components x4, x2 determined up to a multiplicative constant by 
(a41 _ A)xy + ayoxX2 = 0. 


(These A’s are called the eigenvalues and these vectors x eigenvectors of the 
matrix A. Further explanation is given in Sec. 4.0.) 

A system (2) with g # 0 is called nonhomogeneous. Its general solution is of 
the form y = yp, + Yp, where yy, is a general solution of (3) and y, a particular 
solution of (2). Methods of determining the latter are discussed in Sec. 4.6. 

The discussion of critical points of linear systems based on eigenvalues is 
summarized in Tables 4.1 and 4.2 in Sec. 4.4. It also applies to nonlinear systems 
if the latter are first linearized. The key theorem for this is Theorem | in Sec. 4.5, 
which also includes three famous applications, namely the pendulum and van der 
Pol equations and the Lotka—Volterra predator-prey population model. 


CHAPTER 5 


Series Solutions of ODEs. 
Special Functions 


In the previous chapters, we have seen that linear ODEs with constant coefficients can be 
solved by algebraic methods, and that their solutions are elementary functions known from 
calculus. For ODEs with variable coefficients the situation is more complicated, and their 
solutions may be nonelementary functions. Legendre’s, Bessel’s, and the hypergeometric 
equations are important ODEs of this kind. Since these ODEs and their solutions, the 
Legendre polynomials, Bessel functions, and hypergeometric functions, play an important 
role in engineering modeling, we shall consider the two standard methods for solving 
such ODEs. 

The first method is called the power series method because it gives solutions in the 
form of a power series dg + ayx + ao x" + agx? Bente, 

The second method is called the Frobenius method and generalizes the first; it gives 
solutions in power series, multiplied by a logarithmic term In x or a fractional power x’, 
in cases such as Bessel’s equation, in which the first method is not general enough. 

All those more advanced solutions and various other functions not appearing in calculus 
are known as higher functions or special functions, which has become a technical term. 
Each of these functions is important enough to give it a name and investigate its properties 
and relations to other functions in great detail (take a look into Refs. [GenRef1], 
[GenRef10], or [All] in App. 1). Your CAS knows practically all functions you will ever 
need in industry or research labs, but it is up to you to find your way through this vast 
terrain of formulas. The present chapter may give you some help in this task. 


COMMENT. You can study this chapter directly after Chap. 2 because it needs no 
material from Chaps. 3 or 4. 


Prerequisite: Chap. 2. 
Section that may be omitted in a shorter course: 5.5. 
References and Answers to Problems: App. | Part A, and App. 2. 


5.1 Power Series Method 


The power series method is the standard method for solving linear ODEs with variable 
coefficients. It gives solutions in the form of power series. These series can be used 
for computing values, graphing curves, proving formulas, and exploring properties of 
solutions, as we shall see. In this section we begin by explaining the idea of the power 
series method. 
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EXAMPLE 1 


EXAMPLE 2 
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From calculus we remember that a power series (in powers of x — xo) is an infinite 
series of the form 


(1) = A(X — Xo) = dg + a(x — XQ) + do(x — Kol aE ge. 


m=0 


Here, x is a variable. dg, a1, dg,°** are constants, called the coefficients of the series. 
Xo is a constant, called the center of the series. In particular, if x9 = 0, we obtain a power 
series in powers of x 


(2) S Amx™” = dg + ax + ax? + agx® +--. 
m=0 


We shall assume that all variables and constants are real. 

We note that the term “power series” usually refers to a series of the form (1) [or (2)] 
but does not include series of negative or fractional powers of x. We use m as the 
summation letter, reserving n as a standard notation in the Legendre and Bessel equations 
for integer values of the parameter. 


Familiar Power Series are the Maclaurin series 


(|x| < 1, geometric series) 


Ix m=0 
ct m 2 3 
x x x 
v= 3 l+x4 bores 
mao MM! 2! 3! 
20 (-1)™x?™ x2 x4 
cos x > 1 n Hes 
(2m)! 2! 4! 
; rd (- 1%?" 1 x3 x? 
sin x > t t | 


X® 
(2m + 1)! 31S! 


Idea and Technique of the Power Series Method 


The idea of the power series method for solving linear ODEs seems natural, once we 
know that the most important ODEs in applied mathematics have solutions of this form. 
We explain the idea by an ODE that can readily be solved otherwise. 


Power Series Solution. Solve y’ — y = 0. 


Solution. In the first step we insert 


(2) » 
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EXAMPLE 3 


and the series obtained by termwise differentiation 


e 
(3) y’ =a, + 2agx + 3agx7 +-:: » MA, x1 
We 


into the ODE: 


(ay + 2agx 4 3agx? ts++) — (ag + ayx + gx” fee+) =, 


Then we collect like powers of x, finding 


(a, — ao) + (2ag — ay)x + (Bag — ag)x2 +--+ = 0. 


Equating the coefficient of each power of x to zero, we have 


a, — a9 = 0,7 2ag — ay = 0, 3a3 — dz = 0,°°° 
Solving these equations, we may express qd, d2,-+- in terms of ado, which remains arbitrary: 
ay do ag do 
a, = do, a2 > a3 eon 
2 2! 2° 3! 


With these values of the coefficients, the series solution becomes the familiar general solution 


2 3 
, 902 5 90 34 Oe ee x 
y = do + dox 4 i + aie ee? do{ 1+x4 + aoe’ 


Test your comprehension by solving y” + y=0 by power series. You should get the result 
y = agcos x + ay sinx. a 


We now describe the method in general and justify it after the next example. For a given 
ODE 


(4) y" + p@y’ + q@y = 0 


we first represent p(x) and q(x) by power series in powers of x (or of x — x9 if solutions 
in powers of x — xo are wanted). Often p(x) and q(x) are polynomials, and then nothing 
needs to be done in this first step. Next we assume a solution in the form of a power series 
(2) with unknown coefficients and insert it as well as (3) and 


(5) y” = ag + 3+ 2agx + 4+ 3agx2 +--+ = > mm — Dax? 


mM=2 


into the ODE. Then we collect like powers of x and equate the sum of the coefficients of 
each occurring power of x to zero, starting with the constant terms, then taking the terms 
containing x, then the terms in x”, and so on. This gives equations from which we can 
determine the unknown coefficients of (3) successively. 


A Special Legendre Equation. The ODE 


(1 — x%)y" — 2xy’ + 2y = 0 


occurs in models exhibiting spherical symmetry. Solve it. 
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Solution. Substitute (2), (3), and (5) into the ODE. (1 — x?)y" gives two series, one for y" and one for 


—xy’" In the term —2xy’ use (3) and in 2y use (2). Write like powers of x vertically aligned. This gives 


y” = 2aq + 6agx + 12agx* + 20a5x* + 30agx* 4 
—x2y" = Qa x" 6a3x° 12aqx* 
2xy" 2a,x Aayx? 6a3x° 8aqx* 
2y = 2ag + 2ayx 4 Qagx? + 2agx? + agxt +++. 
Add terms of like powers of x. For each power x°, Ry x2, ++ equate the sum obtained to zero. Denote these sums 


by [0] (constant terms), [1] (first power of x), and so on: 


Sum Power Equations 


[0] [x9] dz = —ao 
[1] [x] ag = 0 
[2] [x7] 12aq = 4ae, a4 = {242 = —3a9 
[3] [x3] a5 = 0 since ag =0 
[4] [x4] 30dag = 18a4, ag 38 a4 48 ( dag a9. 
This gives the solution 
y =ayx + ag(1 x? 3x4 3x6 Hace) 


ag and a, remain arbitrary. Hence, this is a general solution that consists of two solutions: x and 
1x2 - 3x4 = 3,6 — +++, These two solutions are members of families of functions called Legendre polynomials 
P,(x) and Legendre functions Q(x); here we have x = P,(x) and 1 x ix 2,8 ee Q,(x). The 
minus is by convention. The index | is called the order of these two functions and here the order is 1. More on 


Legendre polynomials in the next section. | 


Theory of the Power Series Method 


The nth partial sum of (1) is 


(6) Sp(X) = Ag + ay(x — Xq) + a(x — x0) + +++ + an(x — x0)" 
where n = 0, 1,---. If we omit the terms of s,, from (1), the remaining expression is 
(7) Rl) = Gye = xo) epee ag 


This expression is called the remainder of (1) after the term a,(x — xo)". 
For example, in the case of the geometric series 


Te yt se eee te ye ees 
we have 
so=l, Ro =xtx24+x3 4-5, 
sy = 1 +x, RpHxr tx Ptaxttees, 


So=1Ltxt x2, Rg =x? t+xttxr tees, etc. 
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In this way we have now associated with (1) the sequence of the partial sums 
So(X), 51(X), So(x), ++. If for some x = x, this sequence converges, say, 


lim sp(x1) = sr), 
no 


then the series (1) is called convergent at x = x 1, the number s(x,) is called the value 
or sum of (1) at x1, and we write 


eo) 


s(x) = DS am(xy — xo)”. 


m=0 


Then we have for every n, 


(8) S(Xq) = Sn(%q) + Rn (xy). 


If that sequence diverges at x = x1, the series (1) is called divergent at x = xj. 
In the case of convergence, for any positive e there is an N (depending on €) such that, 
by (8) 


(9) IRn(xpl = Is) — sp(xy)| < € for alln > N. 


Geometrically, this means that all s,,(x1) with n > N lie between s(x1) — € and s(x) + € 
(Fig. 104). Practically, this means that in the case of convergence we can approximate the 
sum s(x1) of (1) at x1 by s,,(x 1) as accurately as we please, by taking n large enough. 


aaa 


s(x,)-€ (x,) s(x) +E 


Fig. 104. Inequality (9) 


Where does a power series converge? Now if we choose x = x9 in (1), the series reduces 
to the single term ap because the other terms are zero. Hence the series converges at x9. 
In some cases this may be the only value of x for which (1) converges. If there are other 
values of x for which the series converges, these values form an interval, the convergence 
interval. This interval may be finite, as in Fig. 105, with midpoint x9. Then the series (1) 
converges for all x in the interior of the interval, that is, for all x for which 


(10) |x _ al <R 


and diverges for |x — x9| > R. The interval may also be infinite, that is, the series may 


converge for all x. 
Divergence Convergence Divergence 
R. ai R 


x )—-R Xo x +R 


Fig. 105. Convergence interval (10) of a power series with center x, 
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The quantity R in Fig. 105 is called the radius of convergence (because for a complex 
power series it is the radius of disk of convergence). If the series converges for all x, we 
set R = © (and 1/R = 0). 

The radius of convergence can be determined from the coefficients of the series by 
means of each of the formulas 


Am+1 


(11) 


(a) R= 1/ tim VT (b) R= 1/ tim 


am 


provided these limits exist and are not zero. [If these limits are infinite, then (1) converges 
only at the center x9.] 


Convergence Radius R = ~, 1, 0 


For all three series let m — 


eo S x eight asics Am+1 1/(m + 1)! 1 i pcs 
m=o i! 2! ; ay, 1/m! mt 1 ; 
La Shai heh ga pet ei R=1 
l= % pao am 1 
= mn 7 am+1 (m + 1)! 
> mix = +x + 2x" +--+, 7 al m+1—>2 R=0 
vi +4 


m=0 


Convergence for all x (R = ~) is the best possible case, convergence in some finite interval the usual, and 
convergence only at the center (R = 0) is useless. ai] 


When do power series solutions exist? Answer: if p, q, r in the ODEs 


(12) y” + p@y’ + a@y = rx) 

have power series representations (Taylor series). More precisely, a function f(x) is called 
analytic at a point x = xo if it can be represented by a power series in powers of x — xo 
with positive radius of convergence. Using this concept, we can state the following basic 
theorem, in which the ODE (12) is in standard form, that is, it begins with the y”. If 
your ODE begins with, say, h(x)y”, divide it first by h(x) and then apply the theorem to 
the resulting new ODE. 


Existence of Power Series Solutions 


If p, g, and r in (12) are analytic at x = Xo, then every solution of (12) is analytic 
at X = Xg and can thus be represented by a power series in powers of x — Xg with 
radius of convergence R > 0. 


The proof of this theorem requires advanced complex analysis and can be found in Ref. 
[A11] listed in App. 1. 

We mention that the radius of convergence R in Theorem | is at least equal to the distance 
from the point x = xg to the point (or points) closest to xg at which one of the functions 
P. q, v, as functions of a complex variable, is not analytic. (Note that that point may not 
lie on the x-axis but somewhere in the complex plane.) 
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Further Theory: Operations on Power Series 


In the power series method we differentiate, add, and multiply power series, and we obtain 
coefficient recursions (as, for instance, in Example 3) by equating the sum of the 
coefficients of each occurring power of x to zero. These four operations are permissible 
in the sense explained in what follows. Proofs can be found in Sec. 15.3. 


1. Termwise Differentiation. A power series may be differentiated term by term. More 
precisely: if 


) 


yx) = SS) adm(x — x0)” 


m=0 


converges for |x — xo| < R, where R > 0, then the series obtained by differentiating term 
by term also converges for those x and represents the derivative y’ of y for those x: 


oo 


y¥@) = SS man(x — x0)" * (lx — xol < B). 


m=1 


Similarly for the second and further derivatives. 


2. Termwise Addition. Two power series may be added term by term. More precisely: 
if the series 


) 


(13) >. tnt ag) and > bul =20)" 


m=0 m=0 


have positive radii of convergence and their sums are f(x) and g(x), then the series 


= (am + Diy) _ Xo)" 


m=0 


converges and represents f(x) + g(x) for each x that lies in the interior of the convergence 
interval common to each of the two given series. 


3. Termwise Multiplication. Two power series may be multiplied term by term. More 
precisely: Suppose that the series (13) have positive radii of convergence and let f(x) and 
g(x) be their sums. Then the series obtained by multiplying each term of the first series 


by each term of the second series and collecting like powers of x — xo, that is, 


dobo + (agb + aybo)(x = Xo) + (agbs + ayby + dgbo)(x —_ xo a 
= >) Gobm + arbm—-1 + +++ + ambo)(x — x0) 
m=0 


converges and represents f(x)g(x) for each x in the interior of the convergence interval of 
each of the two given series. 
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4. Vanishing of All Coefficients (“Jdentity Theorem for Power Series.” ) If a power 
series has a positive radius of convergent convergence and a sum that is identically zero 
throughout its interval of convergence, then each coefficient of the series must be zero. 


1. WRITING AND LITERATURE PROJECT. Power 
Series in Calculus. (a) Write a review (2-3 pages) on 
power series in calculus. Use your own formulations and 
examples—do not just copy from textbooks. No proofs. 
(b) Collect and arrange Maclaurin series in a systematic 
list that you can use for your work. 


2-5} REVIEW: RADIUS OF CONVERGENCE 


Determine the radius of convergence. Show the details of 
your work. 


2; > (m + 1)mx™ 


m=0 
ss (-1)”™ 
3 > a zm 
m=0 
oo 2m+1 
42 
iS, m+ 0! 
Co 2 m 
5 ine 2m 
> (3) 
m=0 


6-9| SERIES SOLUTIONS BY HAND 


Apply the power series method. Do this by hand, not by a 
CAS, to get a feel for the method, e.g., why a series may 
terminate, or has even powers only, etc. Show the details. 


6. (1+ xy =y 

7. y' = —2xy 

8. xy’ — 3y = k(= const) 
9% yy" +y=0 


10-14; SERIES SOLUTIONS 


Find a power series solution in powers of x. Show the details. 


10. y" —y’ +xy =0 
11. y" —y' +x?y =0 
12. (1 — x2)y"” — 2xy’ + 2y = 0 
13. y’ +(1 + x%)y =0 
14. y” — 4xy’ + (4x? 


2)y =0 


PROBLEM SET 5-1 


15. Shifting summation indices is often convenient or 
necessary in the power series method. Shift the index 
so that the power under the summation sign is x”. 
Check by writing the first few terms explicity. 


a ai a 2 

pel 1) sol > P xPt4 
24] 1)! 

s=28 p=10P + 1)! 


16-19 | CAS PROBLEMS. IVPs 


Solve the initial value problem by a power series. Graph 
the partial sums of the powers up to and including x°. Find 
the value of the sum s (5 digits) at x}. 


16. y) +4y=1, yO) = 1.25, x, =0.2 


17. y” + 3xy’ + 2y=0, yO) =1, yO) =1, 
x = 0.5 


18. (1 — x”)y” — 2xy’ + 30y = 0, (0) = 0, 
y'(0) = 1.875, x1 = 0.5 


19. (x — 2)y' = xy, yO) =4, x1 =2 


20. CAS Experiment. Information from Graphs of 
Partial Sums. In numerics we use partial sums of 
power series. To get a feel for the accuracy for various 
x, experiment with sin x. Graph partial sums of the 

Maclaurin series of an increasing number of terms, 

describing qualitatively the “breakaway points” of 

these graphs from the graph of sin x. Consider other 

Maclaurin series of your choice. 


Fig. 106. CAS Experiment 20. sin x and partial 
SUMS 53, Ss, 57 
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5.2 Legendre’s Equation. 
Legendre Polynomials P,,(x) 


Legendre’s differential equation’ 
(1) (1 — x)y"” — 2xy’ + n(n + Dy =0 (n constant) 


is one of the most important ODEs in physics. It arises in numerous problems, particularly 
in boundary value problems for spheres (take a quick look at Example 1 in Sec. 12.10). 

The equation involves a parameter n, whose value depends on the physical or 
engineering problem. So (1) is actually a whole family of ODEs. For n = 1 we solved it 
in Example 3 of Sec. 5.1 (look back at it). Any solution of (1) is called a Legendre function. 
The study of these and other “higher” functions not occurring in calculus is called the 
theory of special functions. Further special functions will occur in the next sections. 

Dividing (1) by 1 — x”, we obtain the standard form needed in Theorem | of Sec. 5.1 
and we see that the coefficients —2x/(1 — x?) and n(n + 1)/(. — x) of the new equation 
are analytic at x = 0, so that we may apply the power series method. Substituting 


(2) y= > anx™ 
m=0 


and its derivatives into (1), and denoting the constant n(n + 1) simply by k, we obtain 


C= x?) > mm — ane * = 2x > Manx” 1 +k S diy = 0, 


M=2 m=1 m=0 


By writing the first expression as two separate series we have the equation 


co 


> mn — Dex = SY mm — VYayx™ — SY 2mayx™ + DY) kamx™ = 0. 
m=2 


m=2 m=1 m=0 


It may help you to write out the first few terms of each series explicitly, as in Example 3 
of Sec. 5.1; or you may continue as follows. To obtain the same general power x* in all 
four series, set m — 2 = s (thus m = s + 2) in the first series and simply write s instead 
of m in the other three series. This gives 


) 


ys (s + 2)(s + l)dsy9x* — > s(s — Lagx* — ¥ 2sagx* + y kasx* = 0. 
s=0 s=2 s=1 s=0 


1ADRIEN-MARIE LEGENDRE (1752-1833), French mathematician, who became a professor in Paris in 
1775 and made important contributions to special functions, elliptic integrals, number theory, and the calculus 
of variations. His book Eléments de géométrie (1794) became very famous and had 12 editions in less than 
30 years. 

Formulas on Legendre functions may be found in Refs. [GenRef1] and [GenRef10]. 


176 


CHAP. 5 Series Solutions of ODEs. Special Functions 


(Note that in the first series the summation begins with s = 0.) Since this equation with 
the right side 0 must be an identity in x if (2) is to be a solution of (1), the sum of the 
coefficients of each power of x on the left must be zero. Now x° occurs in the first and 
fourth series only, and gives [remember that k = n(n + 1)] 


(3a) 2+ lag + n(n + 1)ap = 0. 


x! occurs in the first, third, and fourth series and gives 


(3b) 3+ 2ag, + [-2 + n(n + I)]ay = 0. 
The higher powers x”, x,-+» occur in all four series and give 
(3c) (s + 2)(s + Ddgyo + [-s(s — 1) — 2s + n(n + 1)Ja, = 0. 


The expression in the brackets [---] can be written (n — s)(n + s + 1), as you may 
readily verify. Solving (3a) for dg and (3b) for ag as well as (3c) for ds;2, we obtain the 
general formula 


(2 = oie seo sr Ib) 
(Gin 


(4) as+2 = — 


This is called a recurrence relation or recursion formula. (Its derivation you may verify 
with your CAS.) It gives each coefficient in terms of the second one preceding it, except 
for dg and ay, which are left as arbitrary constants. We find successively 


n(n + 1) (n — 1)(n + 2) 
a i = 3! es 
_ (n — 2)(n + 3) _ (n — 3)(n + 4) 
“7 ee Ki oy 
(n — 2)n(n + 1) + 3) (n — 3)(n — 1)\n + 2)(n + 4) 
~ 4! ne 7 5! a 


and so on. By inserting these expressions for the coefficients into (2) we obtain 


(5) y(xX) = agy1 (x) + ayye(x) 
where 
n(n + 1) 4 (A — PAGO ae INCE ae 3)) 4 
(6) yx) = 1- 56° SF = 
a 4! 
= jl 2 = = Il 2 4 
7 fees (n Yn + Ms n (@ = Bie = INGE ar ZN ar 3 : 


3! 5! 
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These series converge for |x| < 1 (see Prob. 4; or they may terminate, see below). Since 
(6) contains even powers of x only, while (7) contains odd powers of x only, the ratio 
y1/y2 is not a constant, so that y, and ys are not proportional and are thus linearly 
independent solutions. Hence (5) is a general solution of (1) on the interval —-1 <x < 1. 

Note that x = +1 are the points at which | — x” = 0, so that the coefficients of the 
standardized ODE are no longer analytic. So it should not surprise you that we do not get 
a longer convergence interval of (6) and (7), unless these series terminate after finitely 
many powers. In that case, the series become polynomials. 


Polynomial Solutions. Legendre Polynomials P,,(x) 


The reduction of power series to polynomials is a great advantage because then we have 
solutions for all x, without convergence restrictions. For special functions arising as 
solutions of ODEs this happens quite frequently, leading to various important families of 
polynomials; see Refs. [GenRef1], [GenRef10] in App. 1. For Legendre’s equation this 
happens when the parameter n is a nonnegative integer because then the right side of (4) 
is zero for s = n, so that a, 49 = 0, dn+4 = 0, dyn+6 = 0,:-:. Hence if 1 is even, y1(x) 
reduces to a polynomial of degree n. If n is odd, the same is true for yo(x). These 
polynomials, multiplied by some constants, are called Legendre polynomials and are 
denoted by P,(x). The standard choice of such constants is done as follows. We choose 
the coefficient a,, of the highest power x” as 


Qn! 1+3 65 Qn= 1 


a ae n! 


(8) An (n a positive integer) 


(and a, = 1 ifn = 0). Then we calculate the other coefficients from (4), solved for a, in 
terms of a, +9, that is, 


GS De+4 1) 
M—-d)n+tse+)" 


(9) as = 


s+2 (s Sn — 2). 
The choice (8) makes p,,(1) = 1 for every n (see Fig. 107); this motivates (8). From (9) 


with s = n — 2 and (8) we obtain 


n(n — 1) n(n — 1) (2n)! 
M8 OR = ~— On = 1) Bae 


Using (2n)! = 2n(2n — 1)(2n — 2)! in the numerator and n! = n(n — 1)! and 
n! = n(n — 1)(n — 2)! in the denominator, we obtain 


n(n — 1)2n(2n — 1)(2n — 2)! 
2(2n — 1)2"n(n — 1)! n(n — 1)(n — 2)! 


an-2 


n(n — 1)2n(2n — 1) cancels, so that we get 


(2n — 2)! 
2"(n — 1)! (n — 2)! 


an-2 — 
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Similarly, 
(n — 2)(n — 3) 
Bai nay 
(2n — 4)! 


2"2! (n — 2)! (n — 4)! 
and so on, and in general, when n — 2m 2 0, 


(2n — 2m)! 


2m! (n — m)!(n — 2m)! 


(10) aAn-2m — (=1)" 


The resulting solution of Legendre’s differential equation (1) is called the Legendre 
polynomial of degree n and is denoted by P,,(x). 
From (10) we obtain 


au (2n — 2m)! 
P = Si m n—-2m 
nla) 2 } i Go 
(11) (2n)! (Qn 9) = 


= ne 
2"(n!) SG IG =o 


where M = n/2 or (n — 1)/2, whichever is an integer. The first few of these functions 
are (Fig. 107) 


Pax) = 1, P(x) = x 
(11) =B@ =3Gx? - 1), P3(x) = 4(5x3 — 3x) 


Pa(x) = §(35x* — 30x? + 3), P(x) = (63x? — 70x? + 15x) 


and so on. You may now program (11) on your CAS and calculate P,(x) as needed. 


P(x) 
1 


-l 


Fig. 107. Legendre polynomials 
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The Legendre polynomials P,,(x) are orthogonal on the interval —1 = x S 1, a basic 
property to be defined and used in making up “Fourier—Legendre series” in the chapter 


on Fourier series (see Secs. 11.5-11.6). 


PROBLEM SET 5.2 


1-5 | LEGENDRE POLYNOMIALS AND 
FUNCTIONS 
1. Legendre functions for n = 0. Show that (6) with 


5. 


n = 0 gives Po(x) = | and (7) gives (use In (1 + x) = 


1h x 


1 1 
34 a ee I . 
1 =x 


x n 
3 5 2 


Verify this by solving (1) with n = 0, setting z = y’ 
and separating variables. 


. Legendre functions for n = 1. Show that (7) with 


n = 1 gives yo(x) = Py(x) = x and (6) gives 


1 1 
=] 2 x* x8 
“ - 8 5 
1 1l+x 
=1-—--—-xlIn 
2 l= 


. Special n. Derive (11') from (11). 
. Legendre’s ODE. Verify that the polynomials in (11") 


satisfy (1). 
Obtain Pg and Py. 


6-9 


CAS PROBLEMS 


6. 


7. 


8. 


9. 


10. 


Graph Po(x),--:, Pio(x) on common axes. For what x 
(approximately) and n = 2,---, 10 is |P,.00)| <= 4? 
From what n on will your CAS no longer produce 
faithful graphs of P,(x)? Why? 

Graph Qo(x), Qi(x), and some further Legendre 
functions. 

Substitute agx® + dg41x5*t + asyox°*? into Legen- 
dre’s equation and obtain the coefficient recursion (4). 
TEAM PROJECT. Generating Functions. Generating 
functions play a significant role in modern applied 
mathematics (see [GenRef5]). The idea is simple. If we 
want to study a certain sequence ( f,,(x)) and can find a 
function 


Gu) = >) frou", 
n=0 
we may obtain properties of (f,(x)) from those of G, 
which “generates” this sequence and is called a 
generating function of the sequence. 


(a) Legendre polynomials. Show that 


1 oo 
(12) Gu, x) = = Py(xu”™ 
: V1 — 2xu + u? = ag 


is a generating function of the Legendre polynomials. 
Hint: Start from the binomial expansion of 1/V/1 — v, 
then set v = 2xu — u?, multiply the powers of 2xu — u2 
out, collect all the terms involving wu”, and verify that 
the sum of these terms is P,(x)u”. 


(b) Potential theory. Let A; and Ag be two points in 
space (Fig. 108, re > 0). Using (12), show that 


1 1 


Vr + re — 2ryrecos 0 
1 <= ay 
= > > P,,(cos 0) 7%) 
m=0 


This formula has applications in potential theory. (Q/r 
is the electrostatic potential at Ag due to a charge Q 
located at A,. And the series expresses 1/r in terms of 
the distances of A, and Ag from any origin O and the 
angle @ between the segments OA, and OAg.) 


Fig. 108. Team Project 10 


(c) Further applications of (12). Show that 
P,Q) = 1, Pa(-1) = (- 1)", Pan +10) = 0, and 
Pon (0) = (-1)"+1-3---(Qn — 1)/[2- 4--- (Qn). 


11-15 


FURTHER FORMULAS 


11. 


12. 


ODE. Find a solution of (a? — x2)y” — 2xy’ + 

n(n + 1)y = 0, a # 0, by reduction to the Legendre 

equation. 

Rodrigues’s formula (13)? Applying the binomial 

theorem to (x2 — 1)”, differentiating it n times term 

by term, and comparing the result with (11), show that 
Lo @ 


1 
2 
DON chek eee. 


(13) P,(x) = 


2OLINDE RODRIGUES (1794-1851), French mathematician and economist. 
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13. Rodrigues’s formula. Obtain (11') from (13). 15. Associated Legendre functions P* (x) are needed, e.g., 


i t hysics. Th defined b 
14. Bonnet’s recursion.* Differentiating (13) with Sa ea ae ea eerie 


respect to u, using (13) in the resulting formula, and 


k 
x2k/2 d"Dy(x) 


comparing coefficients of u™, obtain the Bonnet (15) PK) = (1 

recursion. dx" 

(14) (2 + I)Py 4100) = n + 1)XP, (x) — npy—1@), and are solutions of the ODE 

where n = 1, 2,---. This formula is useful for com- (16) (1 — x?)y" — 2xy’ + gy = 0 

putations, the loss of significant digits being small 

(except near zeros). Try (14) out for a few computations where g(x) = n(n + 1) k/( x2), Find Ph), 
of your own choice. P 4x), P. 3(x), and Pix) and verify that they satisfy (16). 


5.3 Extended Power Series Method: 
Frobenius Method 


Several second-order ODEs of considerable practical importance—the famous Bessel 
equation among them—have coefficients that are not analytic (definition in Sec. 5.1), but 
are “not too bad,” so that these ODEs can still be solved by series (power series times a 
logarithm or times a fractional power of x, etc.). Indeed, the following theorem permits 
an extension of the power series method. The new method is called the Frobenius 
method.* Both methods, that is, the power series method and the Frobenius method, have 
gained in significance due to the use of software in actual calculations. 


THEOREM = 1 Frobenius Method 
Let b(x) and c(x) be any functions that are analytic at x = 0. Then the ODE 


D(x) , — cx) 
+ y + =o) = 0 


(1) y” 


xX 


has at least one solution that can be represented in the form 


(2) y(x) = x" > Amx™ = x"(ag + ayx + agx7 +--+) (dg # 0) 


m=0 


where the exponent r may be any (real or complex) number (and r is chosen so that 
ao # 0). 

The ODE (1) also has a second solution (such that these two solutions are linearly 
independent) that may be similar to (2) (with a different r and different coefficients) 
or may contain a logarithmic term. (Details in Theorem 2 below.) 


30SSIAN BONNET (1819-1892), French mathematician, whose main work was in differential geometry. 

4GEORG FROBENIUS (1849-1917), German mathematician, professor at ETH Zurich and University of Berlin, 
student of Karl Weierstrass (see footnote, Sect. 15.5). He is also known for his work on matrices and in group theory. 

In this theorem we may replace x by x — xg with any number xg. The condition ag # 0 is no restriction; it 
simply means that we factor out the highest possible power of x. 

The singular point of (1) at x = 0 is often called a regular singular point, a term confusing to the student, 
which we shall not use. 
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For example, Bessel’s equation (to be discussed in the next section) 


1 2_ ,,2 
y toy + (: 5 )y =0 (va parameter) 
Xx x 


is of the form (1) with b(x) = 1 and c(x) = x2 — analytic at x = 0, so that the theorem 
applies. This ODE could not be handled in full generality by the power series method. 

Similarly, the so-called hypergeometric differential equation (see Problem Set 5.3) also 
requires the Frobenius method. 

The point is that in (2) we have a power series times a single power of x whose exponent 
r is not restricted to be a nonnegative integer. (The latter restriction would make the whole 
expression a power series, by definition; see Sec. 5.1.) 

The proof of the theorem requires advanced methods of complex analysis and can be 
found in Ref. [A11] listed in App. 1. 


Regular and Singular Points. The following terms are practical and commonly used. 
A regular point of the ODE 


y" + p@y’ + qawy = 0 


is a point xg at which the coefficients p and q are analytic. Similarly, a regular point of 
the ODE 


hey" + POdy'@) + Gay = 0 
is an Xo at which h, P.q are analytic and h(xo) # 0 (so what we can divide by h and get 


the previous standard form). Then the power series method can be applied. If xo is not a 
regular point, it is called a singular point. 


Indicial Equation, Indicating the Form of Solutions 


We shall now explain the Frobenius method for solving (1). Multiplication of (1) by ie 
gives the more convenient form 


(1’) tae + xb(x)y’ + cQx)y = 0. 


We first expand b(x) and c(x) in power series, 
D(x) = bo + byx + byx2 +++, c(x) = cg + Cyx + cox® + °° 


or we do nothing if b(x) and c(x) are polynomials. Then we differentiate (2) term by term, 


finding 
y' (x) = by (m + ink - x" Trao +(r+ Dayxt+-::] 
m=0 
(2*) yY@= DS mt nt r- Vans ™" 
m=0 


lr — Dag + + Irae +>], 
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By inserting all these series into (1') we obtain 
y g 


(3) x"[r(r — Lag + +++] + (bo + bax +++) x (rag + °°") 
+ (co + eux +°°+)x"(ag + yx +-:-) = 0. 


We now equate the sum of the coefficients of each power x", x’~ 1 x"? ... to zero. This 
yields a system of equations involving the unknown coefficients a,,. The smallest power 
is x” and the corresponding equation is 


[r(r a 1) + bor aid Co ldo = 0. 


Since by assumption dg # 0, the expression in the brackets [---] must be zero. This 
gives 


(4) r(r — 1) + bor + co = O. 


This important quadratic equation is called the indicial equation of the ODE (1). Its role 
is as follows. 

The Frobenius method yields a basis of solutions. One of the two solutions will always 
be of the form (2), where r is a root of (4). The other solution will be of a form indicated 
by the indicial equation. There are three cases: 


Case 1. Distinct roots not differing by an integer 1, 2, 3,---. 
Case 2. A double root. 
Case 3. Roots differing by an integer 1, 2, 3,---. 


Cases | and 2 are not unexpected because of the Euler-Cauchy equation (Sec. 2.5), the 
simplest ODE of the form (1). Case | includes complex conjugate roots ry and rg = ry 
because ry — ro = ry — ry = 2i Im ry is imaginary, so it cannot be a real integer. The 
form of a basis will be given in Theorem 2 (which is proved in App. 4), without a general 
theory of convergence, but convergence of the occurring series can be tested in each 
individual case as usual. Note that in Case 2 we must have a logarithm, whereas in Case 3 
we may or may not. 


Frobenius Method. Basis of Solutions. Three Cases 


Suppose that the ODE (1) satisfies the assumptions in Theorem |. Let ry and re be 
the roots of the indicial equation (4). Then we have the following three cases. 


Case 1. Distinct Roots Not Differing by an Integer. A basis is 


(5) yi) = x" (dg F ax + yx" + see) 
and 
(6) yolx) = x"(Ag + Ayx + Agx? + ++) 


with coefficients obtained successively from (3) with r = ry andr = re, respectively. 
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EXAMPLE 1 


EXAMPLE 2 


Case 2. Double Root r, = rg =r. A basis is 

7) yt) = x"(aq + ayx + agx® + ++) [r= 3(1 — b9)] 
(of the same general form as before) and 

(8) yalx) = yr(x) Inx + x"(Ayx + Agx? + ++) (x > 0). 
Case 3. Roots Differing by an Integer. A basis is 

(9) yy(x) = x™(ag + ayx + agx? +++) 

(of the same general form as before) and 

(10) yo(x) = kyy(x) Inx + x"(Ag + Ayx + Aax* aos Fe 


where the roots are so denoted that ry — re > 0 and k may turn out to be zero. 


Typical Applications 


Technically, the Frobenius method is similar to the power series method, once the roots 
of the indicial equation have been determined. However, (5)-(10) merely indicate the 
general form of a basis, and a second solution can often be obtained more rapidly by 
reduction of order (Sec. 2.1). 


Euler—Cauchy Equation, Illustrating Cases 1 and 2 and Case 3 without a Logarithm 
For the Euler-Cauchy equation (Sec. 2.5) 
xy" + boxy’ + coy = 0 (bo, Co constant) 
substitution of y = x” gives the auxiliary equation 
rir — 1) + bor + co = 0, 
which is the indicial equation [and y = x” is a very special form of (2)!]. For different roots rj, ra we get a basis 


yy = x"!, yg = x"?, and for a double root r we get a basis x", x" In x. Accordingly, for this simple ODE, Case 3 
plays no extra role. ia] 


Illustration of Case 2 (Double Root) 
Solve the ODE 


(11) x(x — ly” + Gx -— Dy’ ty =0. 


(This is a special hypergeometric equation, as we shall see in the problem set.) 


Solution. Writing (11) in the standard form (1), we see that it satisfies the assumptions in Theorem 1. [What 
are b(x) and c(x) in (11)?] By inserting (2) and its derivatives (2*) into (11) we obtain 


Smt nan + r= VYamx™*" — YS (m+ om + 7 = Vax 
m=0 m= 


(12) 


+335 (m+ nayx" — dS m+ Nay x ttt + SY anx™*" = 0. 
m=0 


m=0 m=0 
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The smallest power is x occurring in the second and the fourth series; by equating the sum of its coefficients 
to zero we have 


[-r(r — 1) — rlap = 0, thus r? = 0. 


Hence this indicial equation has the double root r = 0. 


First Solution. We insert this value r = 0 into (12) and equate the sum of the coefficients of the power 
x* to zero, obtaining 


(s + 1)saga44 4 


s(s — lds 3sas — (s + l)dg44 + ds = 0 


thus ds. 1 = ds. Hence dg = ay = dg = ---, and by choosing dg = | we obtain the solution 


1 


1-x 


yi) = Sx" = (xl <1). 
0 


m= 


Second Solution. We get a second independent solution yz by the method of reduction of order (Sec. 2.1), 
substituting yo = uy, and its derivatives into the equation. This leads to (9), Sec. 2.1, which we shall use in 
this example, instead of starting reduction of order from scratch (as we shall do in the next example). In (9) of 
Sec. 2.1 we have p = (3x — L/(? — x), the coefficient of y’ in (11) in standard form. By partial fractions, 


d sx= 1 | 2 ; 
BGs xa — 1)“ x-1/ 


Hence (9), Sec. 2.1, becomes 


1 
t) dx 2In(x — 1) — Inx. 


eo 1)? 1 Inx 


2,-sp dx = = 
e . u yo = UY, = 
(—- Dx x an 


ue =U=yy 


y, and yg are shown in Fig. 109. These functions are linearly independent and thus form a basis on the interval 
0 <x < 1 (as well as on 1 < x < %), 


MO WwW BR 
T 


Fig. 109. Solutions in Example 2 


Case 3, Second Solution with Logarithmic Term 


Solve the ODE 


(13) (x? — xy" —xy' +y=0. 


Solution. Substituting (2) and (2*) into (13), we have 


(x? — x) Dy (m+ nin t+ r—- Danx™t™? — x Dy (m+ Nanx™*T 1 + »y ape 0; 


m=0 m=0 m=0 
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We now take x”, x, and x inside the summations and collect all terms with power x”"*" and simplify algebraically, 


> (m+ r—- 1)? ayx™*" > (m+ n(m+r— Danx™t" 1 = 0. 


m=0 m=0 


In the first series we set m = s and in the second m = s + 1, thus s = m — 1. Then 


(14) DG +r 1) 2a,x5** >> (s bp 1\(s f rdg.4x°" =0. 
s=0 s=-1 
The lowest power is x"! (take 5 = —1 in the second series) and gives the indicial equation 
rr— 1) =0. 


The roots are 7; = | and rg = 0. They differ by an integer. This is Case 3. 


First Solution. From (14) with r = r, = 1 we have 


[sas — (s + 2s + Dasy]x8*? = 0. 


Ms 


s=0 
This gives the recurrence relation 
s 
ast a EF” (s =.0; 1,232) 
Hence a, = 0, dg = 0,:-+ successively. Taking ag = 1, we get as a first solution yy = x"ag = x. 


Second Solution. Applying reduction of order (Sec. 2.1), we substitute yg = yyw = xu, yg = xu’ + u and 
ys = xu" + 2u' into the ODE, obtaining 


(x? xy(xu" + 2u') — x(xu’ + uv) + xu = 0. 


xu drops out. Division by x and simplification give 


(x? xu” + (x — 2)’ = 0. 


From this, using partial fractions and integrating (taking the integration constant zero), we get 


eI 


x2 


u" RD, 2 1 ‘ 
7 z + : Inu =I1n 
u 5 lee x |=—z 


u z > uw=Inx+-, yo = xu =xInx +1. 


yy and yo are linearly independent, and yg has a logarithmic term. Hence y, and yg constitute a basis of solutions 
for positive x. 5] 


The Frobenius method solves the hypergeometric equation, whose solutions include 
many known functions as special cases (see the problem set). In the next section we use 
the method for solving Bessel’s equation. 
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PROBLEM SET 5-3 


1. WRITING PROJECT. Power Series Method and of (15) [see the small sample of elementary functions 
Frobenius Method. Write a report of 2-3 pages in part (c)]. This accounts for the importance of (15). 
explaining the difference between the two methods. No (a) Hypergeometric series and function. Show that 
proofs. Give simple examples of your own. the indicial equation of (15) has the roots ry = 0 and 


ro = 1—c. Show that for ry = 0 the Frobenius 


21) FROGENIS MELHOR method gives (16). Motivate the name for (16) by 


Find a basis of solutions by the Frobenius method. Try to showing that 

identify the series as expansions of known functions. Show 

the details of your work. F(1, 1, 1:x) = FU, b, b; 9) = F(a, 1, a;x) = ; 
2. (x + 2)2y” +r + Dy’ —y=0 1—x 


. xy” + 2y’ + xy =0 (b) Convergence. For what a or b will (16) reduce to 
waxy’ +y=0 a polynomial? Show that for any other a, b, c 
xy” + Ox + Dy’ + & + Dy =0 (c # 0, —1, —2,---) the series (16) converges when 


4 

4 

5 

by” $2 y + Ge? = Dp = 0 alt 
7 

8 

9 


+(x — Dy =0 (c) Special cases. Show that 
-y x-Dy= 


xy” +y' —xy =0 (1 + x)” = F(—n, b, b; —x), 
i 2x(x Ly” (x ly’ n y= 0 d x)” =1 hil . 1, 2: x), 
10. xy” + 2y’ + 4xy = 0 arctan x = xF(5, 1,5; —x*) 
11. xy” + (2 — wy’ + — Dy =0 arcsin x = xF(Q, 5,33 x”), 
2. xy" + 6xy’ + (4x2 + Oy =0 In(1 + x) = xF(1, 1, 2; —x), 
13. xy” + (1 — 2x)y’ + (@- Dy =0 ie are ey 
14. TEAM PROJECT. Hypergeometric Equation, Series, Les 
and Function. Gauss’s hypergeometric ODE? is Find more such relations from the literature on special 
functions, for instance, from [GenRef1] in App. 1. 
(15) x(1 — xy" + [c — (a+ b + Lxly’ — aby = 0. (d) Second solution. Show that for ro = 1 — c the 
Frobenius method yields the following solution (where 
Here, a, b, c are constants. This ODE is of the form c # 2,3,4,°"+): 
pay” + pry’ + poy = 0, where po, px, po are polyno- ac (a—c+1b-ct+1) 
mials of degree 2, 1, 0, respectively. These polynomials ya(x) = x 14 tea a x 
are written so that the series solution takes a most prac- (17) ; 
tical form, namely, (a—ct+la@-—ct+Db—-ct+Ib-cet+2 , 
t x 
'(—c¢ =, 
oy = 14 Hoy 4 Mat DOD + 1) 2 a 
_ T T x 
v1 ie tele + 1h foe 
(16) 
ala + 1)(a + 2)b(b + 1)(b+ 2) 3 Show that 
: 3! cle + 1)(e + 2) es yo(x) = xt-SF(a -—c + 1,b-c4+1,2 — 3%). 
This series is called the hypergeometric series. Its sum (e) On the generality of the hypergeometric equation. 
y1(x) is called the hypergeometric function and is Show that 


denoted by F(a, b, c; x). Here, c # 0, —1, —2,---. By 
choosing specific values of a, b, c we can obtain an 
incredibly large number of special functions as solutions 


(18) (2 + At + B)¥ + (Ct + D)v + Ky =0 


®CARL FRIEDRICH GAUSS (1777-1855), great German mathematician. He already made the first of his great 
discoveries as a student at Helmstedt and Gottingen. In 1807 he became a professor and director of the Observatory 
at G6ttingen. His work was of basic importance in algebra, number theory, differential equations, differential 
geometry, non-Euclidean geometry, complex analysis, numeric analysis, astronomy, geodesy, electromagnetism, 
and theoretical mechanics. He also paved the way for a general and systematic use of complex numbers. 
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with y = dy/dt, etc., constant A, B, C, D, K, and t2 + 15-20 | HYPERGEOMETRIC ODE 
At + B= (t — ty)(t — ta), 11 # ta, can be reduced to Find a general solution in terms of hypergeometric 


the hypergeometric equation with independent variable functions. 
one 15. 2x(1 — wy" — (1 + 6x)y’ — 2y = 0 
aaa eae 16. x(1 — xy" + & + 2wy'’ — 2y =0 
17. 4x(1 — xy” + y’ + 8y =0 


and parameters related by Ct, + D = —c(te — 13), 
C=a+b+1, K = ab. From this you see that (15) 18. 4(? — 3 + 2)y — 29+ y =0 

is a “normalized form” of the more general (18) and 2 a, ‘ _ 
that various cases of (18) can thus be solved in terms 19. 2¢ St + O)y + (2t — 3)y— By = 0 
of hypergeometric functions. 20. 31 + NV+ Hh-y=0 


5.4 Bessel’s Equation. Bessel Functions J,(x) 


One of the most important ODEs in applied mathematics in Bessel’s equation,° 
(1) xy" + xy’ + (x? — vy =0 


where the parameter v (nu) is a given real number which is positive or zero. Bessel’s 
equation often appears if a problem shows cylindrical symmetry, for example, as the 
membranes in Sec.12.9. The equation satisfies the assumptions of Theorem 1. To see this, 
divide (1) by x” to get the standard form y” + y’/x + (1 — v?/x?)y = 0. Hence, according 
to the Frobenius theory, it has a solution of the form 


(2) VO) Seane (do # 0). 
m=0 
Substituting (2) and its first and second derivatives into Bessel’s equation, we obtain 


> (m+ rm +r —- l)dyx™" + 2 (m + namx*" 


m=0 m=0 


co oo 

+r+ + 

+ aa ar ae Amx"™*" = 0. 
m=0 m=0 


s+r Yr 


We equate the sum of the coefficients of x to zero. Note that this power x** 
corresponds to m = s in the first, second, and fourth series, and to m = s — 2 in the third 
series. Hence for s = 0 and s = 1, the third series does not contribute since m 2 0. 


SERIEDRICH WILHELM BESSEL (1784-1846), German astronomer and mathematician, studied astronomy 
on his own in his spare time as an apprentice of a trade company and finally became director of the new KGnigsberg 
Observatory. 

Formulas on Bessel functions are contained in Ref. [GenRef10] and the standard treatise [A13]. 


188 


CHAP. 5 Series Solutions of ODEs. Special Functions 


For s = 2,3,--- all four series contribute, so that we get a general formula for all these s. 
We find 

(a) rir — 1)ag + rag — ig =0 (s = 0) 
(3) (b) (r+ Ira, + (r + lay — va, = 0 (c= 1) 


(c) (strys +r- lag + (8 + Nay + ds-g — vs =0 (s = 2,3,-:-). 
From (3a) we obtain the indicial equation by dropping ao, 
(4) (r + vr — v) = 0. 
The roots are ry = v (= 0) and rg = —v. 
Coefficient Recursion for r = r, = v. For r = v, Eq. (3b) reduces to (2v + l)a, = 0. 
Hence a, = 0 since v 2 0. Substituting r = v in (3c) and combining the three terms 
containing a, gives simply 
(5) (s + 2v)sa, + dg_2 = 0. 


Since a, = 0 and v = 0, it follows from (5) that ag = 0, a5 = 0,---. Hence we have to 
deal only with even-numbered coefficients a, with s = 2m. For s = 2m, Eq. (5) becomes 


(2m + 2v)2mdom, + dam—2 = 0. 


Solving for dg, gives the recursion formula 


1 
(6) a ma 1,2, 
2°m(v + m) 
From (6) we can now determine dg, a4,:+- successively. This gives 
do 
22(p + 1) 
a2 do 


a 2 4 
2y+2) 242I0~4+ D+ 2) 


and so on, and in general 


(—1)"ag 


2 ae es 


(7) adam m= 1,2,°°:. 


Bessel Functions J,,(x) for Integer v = n 
Integer values of v are denoted by n. This is standard. For v = n the relation (7) becomes 
(—1)""ao 


32) (n+ In + 2)--(n +m) 


(8) dom m = 1,2,-:- 
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EXAMPLE-1 


do is still arbitrary, so that the series (2) with these coefficients would contain this arbitrary 
factor do. This would be a highly impractical situation for developing formulas or 
computing values of this new function. Accordingly, we have to make a choice. The choice 
dao = 1 would be possible. A simpler series (2) could be obtained if we could absorb the 
growing product (n + 1)(n + 2)-:-(m + m) into a factorial function (n + m)! What 
should be our choice? Our choice should be 


1 
(9) do = 


2") 
because then n! (n + 1)-:-(n + m) = (n + m)! in (8), so that (8) simply becomes 


(-1)™ 


22M"! (n + m)! 


(10) tom = ee 


By inserting these coefficients into (2) and remembering that cj = 0, cz = 0,--- we obtain 
a particular solution of Bessel’s equation that is denoted by J,,(x): 


es) (-1) x?" 
11 Tea ok = 0). 
a me a 22+! (n + m)! —— 


Jy(x) is called the Bessel function of the first kind of order n. The series (11) converges 
for all x, as the ratio test shows. Hence J;,(x) is defined for all x. The series converges 
very rapidly because of the factorials in the denominator. 


Bessel Functions Jo(x) and J,(x) 


For n = 0 we obtain from (11) the Bessel function of order 0 


C7 (- 1) ec x2 xt x8 
my) Jowo= & 2m 2 en bt Rae ee 
paw Cece) 25) 272!) 2568) 
which looks similar to a cosine (Fig. 110). For = 1 we obtain the Bessel function of order 1 
Cy (jee x x? x? x? 


(13) A@= > pe econ 
OI aiicaion AU (7, oa ) iY 0) re) PL 


which looks similar to a sine (Fig. 110). But the zeros of these functions are not completely regularly spaced 
(see also Table Al in App. 5) and the height of the “waves” decreases with increasing x. Heuristically, n?/x? 
in (1) in standard form [(1) divided by x7] is zero (if n = 0) or small in absolute value for large x, and so is 
y'/x, so that then Bessel’s equation comes close to y” + y = 0, the equation of cos x and sin x; also y’/x acts 
as a “damping term,” in part responsible for the decrease in height. One can show that for large x, 


(14) WP C4 leer = cos (: > =) 


where ~ is read “asymptotically equal” and means that for fixed n the quotient of the two sides approaches | 
as x > %, 
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(TTT tri 


Fig. 110. Bessel functions of the first kind J, and J, 


Formula (14) is surprisingly accurate even for smaller x (>0). For instance, it will give you good starting 
values in a computer program for the basic task of computing zeros. For example, for the first three zeros of Jo 
you obtain the values 2.356 (2.405 exact to 3 decimals, error 0.049), 5.498 (5.520, error 0.022), 8.639 (8.654, 
error 0.015), etc. i 


Bessel Functions J,(x) for any v = 0. Gamma Function 


We now proceed from integer v = n to any v = 0. We had ap = 1/(2”n!) in (9). So we 
have to extend the factorial function n! to any v = O. For this we choose 


1 


(15) 0 STi 1 


with the gamma function ['(v + 1) defined by 


oo) 


(16) Tw += | et” dt Ceo 


0 


(CAUTION! Note the convention v + | on the left but v in the integral.) Integration 
by parts gives 
Tw t+ 1) = -e7t?| + v| et’ tdt=0+4+ I). 
0 
0 


This is the basic functional relation of the gamma function 
(17) Tw + 1) = v0). 


Now from (16) with v = 0 and then by (17) we obtain 
r= | e 'dt=-—e*| =0-(-l=1 
0 0 


and then [(2) = 1 - Pd.) = 1!, (3) = 21) = 2! and in general 


(18) Tin + 1) =n! (n = 0,1,-°*). 
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THEOREM 1 


Hence the gamma function generalizes the factorial function to arbitrary positive v. 
Thus (15) with v = n agrees with (9). 
Furthermore, from (7) with dg given by (15) we first have 
So a 
22m! (v + 1) + 2)--- + m2T + 1) 


adam 


Now (17) gives (v + DPQ + 1) =T@ +t 2), (+ 2)P@ + 2) = FP + 3) and so on, 
so that 


(v + lw + 2):---@t+t mo + 1)=TWt+m + 1). 
Hence because of our (standard!) choice (15) of do the coefficients (7) are simply 


(=1)" 
22™+vmI Tv + m+ 1) 


(19) adam = 


With these coefficients and r = r, = v we get from (2) a particular solution of (1), denoted 
by J,(x) and given by 


(20) I(x) =x" > ae : 
pam e *m! T+ m+ 1) 


J,(x) is called the Bessel function of the first kind of order v. The series (20) converges 
for all x, as one can verify by the ratio test. 


Discovery of Properties from Series 


Bessel functions are a model case for showing how to discover properties and relations of 
functions from series by which they are defined. Bessel functions satisfy an incredibly large 
number of relationships—look at Ref. [A13] in App. 1; also, find out what your CAS knows. 
In Theorem 3 we shall discuss four formulas that are backbones in applications and theory. 


Derivatives, Recursions 


The derivative of J,(x) with respect to x can be expressed by J,,_4(x) or Jy,44(x) by 
the formulas 


@) LG) =2’LAG) 


(b) [x *Y,@))' = —x "Y,41G). 


(21) 


Furthermore, J,(x) and its derivative satisfy the recurrence relations 


2v 
(21) (ce) Jy-1) + Jy410) = Jv) 


@) a) — SG) = 20.0). 
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PROOF 


EXAMPLE 2 
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(a) We multiply (20) by x” and take x?” under the summation sign. Then we have 


oo en 


x" F(x) = By 


m=0 


22™vmt Tv + m+ 1) 


y= 


We now differentiate this, cancel a factor 2, pull xv} out, and use the functional 
relationship [(v + m + 1) = (v + m)['(v + m) [see (17)]. Then (20) with v — 1 instead 
of v shows that we obtain the right side of (21a). Indeed, 


(- 16 ar ea 


22m lint Py + m)- 


Ps (-1)™2(m +4 moe 


(xJ,)' = 
2 2m Tv + m+ 1) 


m=0 


= vw-I~N 
XX > 


m=0 


(b) Similarly, we multiply (20) by x~”, so that x” in (20) cancels. Then we differentiate, 
cancel 2m, and use m! = m(m — 1)!. This gives, with m = 5 + 1, 


aL) = > = 


oo (=17"-? « 1p", "* 
g2mtv-lig — IT tmtl) 2 


= Ay 2B8**ASI Ty + 5 + 2) 


Equation (20) with v + 1 instead of v and s instead of m shows that the expression on 
the right is —x~ °J,,,4(x). This proves (21b). 

(c), (d) We perform the differentiation in (21a). Then we do the same in (21b) and 
multiply the result on both sides by x2”. This gives 


(a*) pe FH Tey 
(b*)  —vx’ ty, + x = -x 41. 


Substracting (b*) from (a*) and dividing the result by x” gives (21c). Adding (a*) and 
(b*) and dividing the result by x” gives (21d). | 


Application of Theorem 1 in Evaluation and Integration 


Formula (21c) can be used recursively in the form 
2v 
Fy) = > Iv) = Sy—100 


for calculating Bessel functions of higher order from those of lower order. For instance, Jo(x) = 2J4(x)/x — Jo(x), 
so that Jz can be obtained from tables of Jo and J; (in App. 5 or, more accurately, in Ref. [GenRef1] in App. 1). 

To illustrate how Theorem | helps in integration, we use (21b) with v = 3 integrated on both sides. This 
evaluates, for instance, the integral 


2 2 
1 
f= | x 3Ja(x) dx = —x3Jg(x)| = 3/32) + Ja(1). 
1 un 


A table of Js (on p. 398 of Ref. [GenRef1]) or your CAS will give you 
— - 0.128943 + 0.019563 = 0.003445. 


Your CAS (or a human computer in precomputer times) obtains J3 from (21), first using (21c) with v = 2, 
that is, Jg = 4x7 Ty — Jy, then (21c) with v = 1, that is, Jo = 2x, — Jo. Together, 
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EXAMPLE 3 


2 
T= x73(4x 1x7 — Jo) — Ji) 
1 


§[2J4(2) — 2o(2) — Ja(2)] + [84(1) — 4Jo(1) — (1) 


g/1(2) + aJo(2) + Ta(1) — 4Jo(1). 


This is what you get, for instance, with Maple if you type int(---). And if you type evalf(int(---)), you obtain 
0.003445448, in agreement with the result near the beginning of the example. 


Bessel Functions J,, with Half-Integer v Are Elementary 


We discover this remarkable fact as another property obtained from the series (20) and 
confirm it in the problem set by using Bessel’s ODE. 


Elementary Bessel Functions J, with vy = +3, +3, +3,---. The Value I'(3) 


We first prove (Fig. 111) 


ie {2 
(22) (a) Jiy2) = a in Ki (b)  J-1/2(0) = Fe 0O8*: 


The series (20) with vp = 3 is 


_anpe pn 2a _Cp™en 
Jijax) = Vx > 22! Pon + 3) : p2 72 


m+1 3)° 
! 3 
m=0 m=0 ge Pm - 2) 


The denominator can be written as a product AB, where (use (16) in B) 


A =2™m! = 2m(2m — 2)(Qm — 4)--- 4-2, 
B= 2" Pom + 3) = 2"*'(m + 3)0m — 3) 3 - BT) 
= (2m + 1)2m — 1)+::3-1- V7; 


here we used (proof below) 
(23) T@) = Vor. 


The product of the right sides of A and B can be written 


AB = (2m + 1)2m(m — 1) +++ 3+ 2+ 1V a = 2m + DIV. 


2 oo (- 1y™%2r4 2 : 
Jijalx) \) 7x >> (2m + 1)! a 0% 
m=0 


Hence 


oO 


Qn An 6n x 


Fig. 111. Bessel functions J;/2 and J_1/2 
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This proves (22a). Differentiation and the use of (21a) with v = Z now gives 


2 
[VxJijo(x)]' = [Zoos = x2 J—1/2(x). 


This proves (22b). From (22) follow further formulas successively by (21c), used as in Example 2. 
We finally prove I) = V7 by a standard trick worth remembering. In (15) we set t = u?. Then 
dt = 2u du and 


We square on both sides, write v instead of u in the second integral, and then write the product of the integrals 


as a double integral: 
1 ane ar a ih bd 
r(4) = 4| en du | er a= a| | e+) dy du. 


0 0 0-0 


We now use polar coordinates r, @ by setting u = rcos 0, v = r sin 0. Then the element of area is du du = r dr d0 
and we have to integrate over r from 0 to © and over @ from 0 to 77/2 (that is, over the first quadrant of the 
uv-plane): 


By taking the square root on both sides we obtain (23). fei] 


General Solution. Linear Dependence 


For a general solution of Bessel’s equation (1) in addition to J,, we need a second linearly 
independent solution. For v not an integer this is easy. Replacing v by —v in (20), we 
have 


Cd (-1)™x2™ 


24 ILw@=s" 
ae @)=x 2 22 Ym! Tim — v + 1) 


m=0 


Since Bessel’s equation involves v”, the functions J, and J_,, are solutions of the equation 
for the same v. If v is not an integer, they are linearly independent, because the first terms 
in (20) and in (24) are finite nonzero multiples of x” and x~”. Thus, if v is not an integer, 
a general solution of Bessel’s equation for all x # 0 is 


y(x) = cy J(x) + Cod _,(X) 
This cannot be the general solution for an integer v = n because, in that case, we have 
linear dependence. It can be seen that the first terms in (20) and (24) are finite nonzero 


multiples of x” and x~”, respectively. This means that, for any integer v = n, we have 
linear dependence because 


(25) t= (=) FG) (n = 1,2,---). 
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PROOF To prove (25), we use (24) and let v approach a positive integer n. Then the gamma 
function in the coefficients of the first n terms becomes infinite (see Fig. 553 in App. 


A3.1), the coefficients become zero, and the summation starts with m = n. Since in 


this case [(m — n + 1) = (m — n)! by (18), we obtain 


co (-1)™%2"—-"” 


(- re 


26) Ja@= > ae 


mn 


mi(m—n)! 225 275*" (n + 5)! 5! 


(m=nts). 


The last series represents (— 1)"J,,(x), as you can see from (11) with m replaced by s. This 


completes the proof. 


The difficulty caused by (25) will be overcome in the next section by introducing further 
Bessel functions, called of the second kind and denoted by Y,. 


1. Convergence. Show that the series (11) converges for 
all x. Why is the convergence very rapid? 


ODEs REDUCIBLE TO BESSEL’S ODE 


This is just a sample of such ODEs; some more follow in 
the next problem set. Find a general solution in terms of J, 
and J_,, or indicate when this is not possible. Use the 
indicated substitutions. Show the details of your work. 


a9) =0 
3.xy" ty +hy=0 (Ve=9 
4.y" +(e" -— Hy =0 ( *=2) 
5. Two-parameter ODE 
xry” xy’ (A2x2 vy =(0 (Ax =z) 
6. x°y" +4043 y=0 (y=uvx, Ve =2 


2. x7y" + xy! + (x? 


7. x?y" + xy’ + $07 - Dy =0 (x = 22) 
8. (2x + 1)2y” + 22x + Dy’ + 1l6x(x + Dy = 0 
(Qx+1=2) 


9. xy” + Qv 4+ ly’ t+axy=0 (Cy =x7'n) 

10. x?y" + (1 = 2v)xy’ + V2?” +: 1 - Dy = 0 
(y = x"u, x" = z) 

11. CAS EXPERIMENT. Change of Coefficient. Find 
and graph (on common axes) the solutions of 


y” + kx ty’ + y = 0, (0) = 1, y'(0) = 0, 


for k = 0,1, 2,---,10 (or as far as you get useful 
graphs). For what k do you get elementary functions? 
Why? Try for noninteger k, particularly between 0 and 2, 
to see the continuous change of the curve. Describe the 
change of the location of the zeros and of the extrema as 
k increases from 0. Can you interpret the ODE as a model 
in mechanics, thereby explaining your observations? 


12. CAS EXPERIMENT. Bessel Functions for Large x. 


(a) Graph J,(x) for n = 0,---,5 on common axes. 


PROBLEM SET 5.4 


(b) Experiment with (14) for integer n. Using graphs, 
find out from which x = x, on the curves of (11) 
and (14) practically coincide. How does x, change 
with n? 

(c) What happens in (b) ifn = +19 (Our usual notation 
in this case would be v.) 

(d) How does the error of (14) behave as a func- 
tion of x for fixed n? [Error = exact value minus 
approximation (14).] 

(e) Show from the graphs that Jo(x) has extrema where 
J1(x) = 0. Which formula proves this? Find further 
relations between zeros and extrema. 


13-15 
modeling (e.g. of vibrations; see Sec. 12.9). 


ZEROS of Bessel functions play a key role in 


13. Interlacing of zeros. Using (21) and Rolle’s theorem, 
show that between any two consecutive positive zeros 
of J,(x) there is precisely one zero of Jn, +1(x). 


14. Zeros. Compute the first four positive zeros of Jo(x) 
and J1(x) from (14). Determine the error and comment. 


15. Interlacing of zeros. Using (21) and Rolle’s theorem, 
show that between any two consecutive zeros of Jo(x) 
there is precisely one zero of Jy(x). 


HALF-INTEGER PARAMETER: APPROACH 
BY THE ODE 


16. Elimination of first derivative. Show that y = uv 
with v(x) = exp (3 J p(x) dx) gives from the ODE 
y” + poy’ + q@y = 0 the ODE 


u” + [qx — kp 


16-18 


3 p'(x~)] u = 0, 


not containing the first derivative of u. 
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17. Bessel’s equation. Show that for (1) the substitution 21. Basic integral formula. Show that 


in Prob. 16 is y = ux 


~1/? and gives 


[rns dx = x"J,(x) + ¢. 


(27) xu" + 2 +1 - pu =0. _ 
22. Basic integral formulas. Show that 
18. Elementary Bessel functions. Derive (22) in Example 3 | xy 41x) dx = —x (x) +c, 
from (27). 


19-25 | APPLICATION OF (21): DERIVATIVES, 


te 1%) dx = [rea — 25,(x). 


INTEGRALS 23. Integration. Show that — [.x7Jo(x) dx = x°J,(x) + 
Use the powerful formulas (21) to do Probs. 19-25. Show xJo(x) — fJo(x) dx. (The last integral is nonelemen- 
the details of your work. tary; tables exist, e.g., in Ref. [A13] in App. 1.) 
: 1 
19. Derivatives. Show that Ji(x) = —Jy(x), Ji(x) = 24 Integration. Evaluate fx" Ja(x) dx. 


Jo(x) — JxQ0)/x, 


J5(x) = 3[ Jy(x) — Ja(x)]. 25. Integration. Evaluate fJ5(x) dx. 


20. Bessel’s equation. Derive (1) from (21). 


5.5 Bessel Functions Y,(x). General Solution 


To obtain a general solution of Bessel’s equation (1), Sec. 5.4, for any v, we now introduce 
Bessel functions of the second kind Y,(x), beginning with the case v = n = 0. 
When n = 0, Bessel’s equation can be written (divide by x) 


(1) xy” ty’ +xy=0. 
Then the indicial equation (4) in Sec. 5.4 has a double root r = 0. This is Case 2 in Sec. 


5.3. In this case we first have only one solution, Jo(x). From (8) in Sec. 5.3 we see that 
the desired second solution must be of the form 


CO) ya(x) = Jo(x) Inx + SS) Amx™. 


m=1 


We substitute yo and its derivatives 


Jo 2 
yo = Jolnx + Pie ps mAyx™ + 


m=1 
Ih J oo 
yg =JoInx +E a mlm — Ama”? 
m=1 


into (1). Then the sum of the three logarithmic terms xJ@ In x, Jo In x, and xJo In x is zero 
because Jo is a solution of (1). The terms —Jo/x and Jo/x (from xy” and y’) cancel. Hence 
we are left with 


239+ SS mm — DAnx™ * + SY mAnx™ > + > Amx™** = 0. 


m=1 m=1 m=1 
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Addition of the first and second series gives =mA,,x™— 1. The power series of Jo(x) is 
obtained from (12) in Sec. 5.4 and the use of m!/m = (m — 1)! in the form 


22 (mj? —-3 22m— aT = 


Jox) = > 


m=1 


Together with 2m7A,,x"™ —1 and SA,,x""*?! this gives 


co (=i se? oo co 
(3*) e2 er aes mAy, x 1 + >» Ane? =, 
m=1 2 m! (m ~ I m=1 m=1 


First, we show that the A,,, with odd subscripts are all zero. The power x° occurs only in 
the second series, with coefficient Ay. Hence A; = 0. Next, we consider the even powers 
x75. The first series contains none. In the second series, m — 1 = 2s gives the term 
(25°F iB ge OMe aa In the third series, m + 1 = 2s. Hence by equating the sum of the 
coefficients of x75 to zero we have 


(2s + 1)"Aosy1 + Avs—1 = 0, s=1,2,-° 
Since A; = 0, we thus obtain A3 = 0, As = 0,---, successively. 
2st+1 


We now equate the sum of the coefficients of x to zero. For s = 0 this gives 
-1+4A,=0, thus Ag =f. 


For the other values of s we have in the first series in (3*) 2m — 1 = 2s + 1, hence 
m=s +t 1,inthesecondm — 1 = 2s + 1,andinthe thirdm + 1 = 2s + 1. We thus obtain 


(- ie 


Ps + Ds! =F (2s =F 2) Agen + Ags = 0. 
S Pee 


For s = | this yields 
4+ 1644+A,=0, thus Aqg=—qs 


and in general 


—1y""1 1 1 1 _ 
(3) Aam = 52m m2 eas eu eae : m = 1,2,°-°- 
Using the short notations 
1 1 
(4) hy =1 hy =l+ote 4 m = 2,3,-°° 
2 m 
and inserting (4) and Ay = Ag = -:: = 0 into (2), we obtain the result 
2 ee 
yo(x) = Jo(x) Inx + ee ee pm 
m=1 (m! i, 
l 5» 3 4 11 6 
= Inx + - +: 
©) Jo@) nx + 4x" ~ T9g* * 73,824 
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Since Jo and yo are linearly independent functions, they form a basis of (1) for x > 0. 
Of course, another basis is obtained if we replace ys by an independent particular solution 
of the form a(yg + bJo), where a (# 0) and b are constants. It is customary to choose 
a = 2/7 and b = y — In2, where the number y = 0.57721566490 --- is the so-called 
Euler constant, which is defined as the limit of 


1 
+ Seed om ilmg 


Ni] 


1+ 


as s approaches infinity. The standard particular solution thus obtained is called the Bessel 
function of the second kind of order zero (Fig. 112) or Neumann’s function of order 
zero and is denoted by Yo(x). Thus [see (4)] 


2 ele 
(6) ¥(x) = =| Jo(x) (in - y) +> 


m=1 


m 


For small x > 0 the function Yo(x) behaves about like In x (see Fig. 112, why?), and 
Yo(x) > -~ asx > 0. 


Bessel Functions of the Second Kind Y,,(x) 


For v = n = 1, 2,--- a second solution can be obtained by manipulations similar to those 
for n = 0, starting from (10), Sec. 5.4. It turns out that in these cases the solution also 
contains a logarithmic term. 

The situation is not yet completely satisfactory, because the second solution is defined 
differently, depending on whether the order v is an integer or not. To provide uniformity 
of formalism, it is desirable to adopt a form of the second solution that is valid for all 
values of the order. For this reason we introduce a standard second solution Y,(x) defined 
for all v by the formula 


OM 6 = — ieee el 
(7) sin V7T 
(b) YG) = lim ¥,(x). 


This function is called the Bessel function of the second kind of order v or Neumann’s 
function’ of order v. Figure 112 shows Yo(x) and ¥,(x). 

Let us show that J, and Y, are indeed linearly independent for all v (and x > 0). 

For noninteger order v, the function Y,(x) is evidently a solution of Bessel’s equation 
because J,(x) and J_,,(x) are solutions of that equation. Since for those v the solutions 
J, and J_,, are linearly independent and Y, involves J_,, the functions J, and Y, are 


7 CARL NEUMANN (1832-1925), German mathematician and physicist. His work on potential theory using 
integer equation methods inspired VITO VOLTERRA (1800-1940) of Rome, ERIK TVAR FREDHOLM (1866-1927) 
of Stockholm, and DAVID HILBERT (1962-1943) of Gottingen (see the footnote in Sec. 7.9) to develop the field 
of integral equations. For details see Birkhoff, G. and E. Kreyszig, The Establishment of Functional Analysis, Historia 
Mathematica 11 (1984), pp. 258-321. 

The solutions ¥,(x) are sometimes denoted by N,(x); in Ref. [A13] they are called Weber’s functions; Euler’s 
constant in (6) is often denoted by C or In y. 
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THEOREM -1 


0.5 


Fig. 112. Bessel functions of the second kind Yo and 4. 
(For a small table, see App. 5.) 


linearly independent. Furthermore, it can be shown that the limit in (7b) exists and Y, 
is a solution of Bessel’s equation for integer order; see Ref. [A13] in App. 1. We shall 
see that the series development of Y,,(x) contains a logarithmic term. Hence J,(x) and 
Y,(x) are linearly independent solutions of Bessel’s equation. The series development 
of ¥,(x) can be obtained if we insert the series (20) in Sec. 5.4 and (2) in this section 
for J,(x) and J_, (x) into (7a) and then let v approach n; for details see Ref. [A13]. The 
result is 


ce m-1 
Ya) = = - Int (In ryt EOS 1)" (im + Ren) 2m 


Ti, 22+! (m + n)! 
(8) re 
ee Ss (n= m=)! am 
eae ae 


where x > 0,n = 0, 1,---, and [as in (4)] hp = 0, hy = 1, 


1 1 1 1 
hm =lts+e4+—, Iman =1te4+-4 
se 2 m een 2 mtn 


For n = 0 the last sum in (8) is to be replaced by 0 [giving agreement with (6)]. 
Furthermore, it can be shown that 


¥_y(x) = (—1)"¥,@). 


Our main result may now be formulated as follows. 


General Solution of Bessel’s Equation 


A general solution of Bessel’s equation for all values of v (and x > 0) is 


(9) yx) = CyJ,X) + Co¥,). 


We finally mention that there is a practical need for solutions of Bessel’s equation that 
are complex for real values of x. For this purpose the solutions 


HDG) = J,@) + i%,00) 


10 
—s HP) = I) - i¥%,@) 
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are frequently used. These linearly independent functions are called Bessel functions of 
the third kind of order v or first and second Hankel functions® of order v. 

This finishes our discussion on Bessel functions, except for their “orthogonality,” which 
we explain in Sec. 11.6. Applications to vibrations follow in Sec. 12.10. 


PROBLEM SET 5-5 


1-9 


FURTHER ODE’s REDUCIBLE 


TO BESSEL’S ODE 


Find a general solution in terms of J, and Y¥,. Indicate 
whether you could also use J_, instead of Y,. Use the 
indicated substitution. Show the details of your work. 


1. x2y” + xy’ + &? - 16) y =0 

2. xy" + 5y’ + xy =0 (y = u/x?) 

3. 9x2y"” + Oxy’ + 36x* — 16)y =0 (x? = 2) 

4.y" +xy=0 (v= uVx, 3x?” =z) 

5. 4xy" + 4y' +y=0 (Ve=2) 

6. xy" + y' + 36y =0 (12Vx =2) 

7. y" + k*x*y =0 (y = uVx, 5kx? = 2) 

8. y” kexty =0 (y =uVx, zk? = z) 

9. xy” —5y’ +xy =0 (y = x3u) 
10. CAS EXPERIMENT. Bessel Functions for Large x. 


It can be shown that for large x, 


(11) ¥, (x) ~ V2/(rx) sin (x — $n — 477) 

with ~ defined as in (14) of Sec. 5.4. 

(a) Graph ¥,,(x) forn = 0,---, 5 on common axes. Are 
there relations between zeros of one function and 
extrema of another? For what functions? 

(b) Find out from graphs from which x = x, on the 
curves of (8) and (11) (both obtained from your CAS) 
practically coincide. How does x, change with n? 


(c) Calculate the first ten zeros x», m = 1,---, 10, of 
Yo(x) from your CAS and from (11). How does the error 
behave as m increases? 


(d) Do (c) for Y;(x) and Y5(x). How do the errors 
compare to those in (c)? 


11-15| HANKEL AND MODIFIED 
BESSEL FUNCTIONS 
11. Hankel functions. Show that the Hankel functions (10) 


12. 


13. 


14. 


15. 


form a basis of solutions of Bessel’s equation for any v. 
Modified Bessel functions of the first kind of order 
v are defined by /,(x) = i “J, (ix), i = V—1. Show 
that J, satisfies the ODE 


(12) 20 


xy ot vy = 0. 


Modified Bessel functions. Show that /,(x) has the 
representation 


, 


(x2 t 


c) 2m+v 


Lw= > * 


22mm! Ton + v + 1) 


(13) 


Reality of I. Show that /,(x) is real for all real x (and 
real v), /,(x) # 0 for all real x # 0, and J_y(x) = [,(%), 
where n is any integer. 

Modified Bessel functions of the third kind (sometimes 
called of the second kind) are defined by the formula (14) 
below. Show that they satisfy the ODE (12). 


(14) 


K = id vi I 
v(X) 2sin var [ —y(x) Wa) |. 


CHAPTER 5 REVIEW QUESTIONS AND PROBLEMS 


. Why are we looking for power series solutions of ODEs? 

. What is the difference between the two methods in this 
chapter? Why do we need two methods? 

. What is the indicial equation? Why is it needed? 

. List the three cases of the Frobenius method, and give 
examples of your own. 


. Write down the most important ODEs in this chapter 
from memory. 


6. 


Can a power series solution reduce to a polynomial? 
When? Why is this important? 


. What is the hypergeometric equation? Where does the 


name come from? 


. List some properties of the Legendre polynomials. 
. Why did we introduce two kinds of Bessel functions? 
10. 


Can a Bessel function reduce to an elementary func- 
tion? When? 


83HERMANN HANKEL (1839-1873), German mathematician. 
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11-20) POWER SERIES METHOD 14. 16(x + 1)*y” + 3y =0 

OR FROBENIUS METHOD 15, x7y" + xy’ + (x? — 5)y = 0 
Find a basis of solutions. Try to identify the series as 16. x“y 2x3 y" + (x? 2)y =0 
expansions of known functions. Show the details of your 17. xy" —(x+ Dy’ +y=0 


work. ” ! Bee os 
11. y" + 4y=0 18. xy’ + 3y + 4x°y =0 


12. xy” + (1 — 2x)y’ + (@- Dy =0 y 
BeGHiyy -G= ly -s5y 0 20. xy” + y’ — xy =0 


SUMMARY-OF-CHAPTER-D 


Series Solution of ODEs. Special Functions 


The power series method gives solutions of linear ODEs 
(1) y” + pQ)y’ + qi@oy = 0 


with variable coefficients p and q in the form of a power series (with any center xo, 
e.g., Xo = 0) 


co 


(2) ya) = Dd) ame — x0)” = ao + ay(x — x0) + aa(x — xo) + °°. 


m=0 


Such a solution is obtained by substituting (2) and its derivatives into (1). This gives 
a recurrence formula for the coefficients. You may program this formula (or even 
obtain and graph the whole solution) on your CAS. 

If p and q are analytic at xo (that is, representable by a power series in powers 
of x — x9 with positive radius of convergence; Sec. 5.1), then (1) has solutions of 
this form (2). The same holds if , Pp, q in 


A(Qoy” + py’ + Fody = 0 


are analytic at x9 and h(x) # 0, so that we can divide by h and obtain the standard 
form (1). Legendre’s equation is solved by the power series method in Sec. 5.2. 
The Frobenius method (Sec. 5.3) extends the power series method to ODEs 
n ax), b(x) 
yo yo 
k= XG (x — xo) 


(3) 


whose coefficients are singular (i.e., not analytic) at x9, but are “not too bad,” 
namely, such that a and b are analytic at x9. Then (3) has at least one solution of 
the form 


(4) y(x) = (& — x0)” Sam (x — x0)” = ag(x — x0)" + ay(x — xp)" **2 ++ 
m=0 
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where r can be any real (or even complex) number and is determined by substituting 
(4) into (3) from the indicial equation (Sec. 5.3), along with the coefficients of (4). 
A second linearly independent solution of (3) may be of a similar form (with different 
r and d,,’s) or may involve a logarithmic term. Bessel’s equation is solved by the 
Frobenius method in Secs. 5.4 and 5.5. 

“Special functions” is a common name for higher functions, as opposed to the 
usual functions of calculus. Most of them arise either as nonelementary integrals [see 
(24)-(44) in App. 3.1] or as solutions of (1) or (3). They get a name and notation 
and are included in the usual CASs if they are important in application or in theory. 
Of this kind, and particularly useful to the engineer and physicist, are Legendre’s 
equation and polynomials Po, P,, - +: (Sec. 5.2), Gauss’s hypergeometric equation 
and functions F(a, b, c; x) (Sec. 5.3), and Bessel’s equation and functions J, and 
Y, (Secs. 5.4, 5.5). 


CHAPTER 6 


Laplace Transforms 


Laplace transforms are invaluable for any engineer’s mathematical toolbox as they make 
solving linear ODEs and related initial value problems, as well as systems of linear ODEs, 
much easier. Applications abound: electrical networks, springs, mixing problems, signal 
processing, and other areas of engineering and physics. 

The process of solving an ODE using the Laplace transform method consists of three 
steps, shown schematically in Fig. 113: 


Step I. The given ODE is transformed into an algebraic equation, called the subsidiary 
equation. 
Step 2. The subsidiary equation is solved by purely algebraic manipulations. 


Step 3. The solution in Step 2 is transformed back, resulting in the solution of the given 
problem. 


IVP AP Solving Solution 
Initial Value Algebraic} /————>| AP -——>| of the 
Problem @ | Problem @ | byAlgebra | ©) IVP 


Fig. 113. Solving an IVP by Laplace transforms 


The key motivation for learning about Laplace transforms is that the process of solving 
an ODE is simplified to an algebraic problem (and transformations). This type of 
mathematics that converts problems of calculus to algebraic problems is known as 
operational calculus. The Laplace transform method has two main advantages over the 
methods discussed in Chaps. 1-4: 


I. Problems are solved more directly: Initial value problems are solved without first 
determining a general solution. Nonhomogenous ODEs are solved without first solving 
the corresponding homogeneous ODE. 


II. More importantly, the use of the unit step function (Heaviside function in Sec. 6.3) 
and Dirac’s delta (in Sec. 6.4) make the method particularly powerful for problems with 
inputs (driving forces) that have discontinuities or represent short impulses or complicated 
periodic functions. 


203 


204 


CHAP. 6 Laplace Transforms 


The following chart shows where to find information on the Laplace transform in this 
book. 


Topic Where to find it 
ODEs, engineering applications and Laplace transforms Chapter 6 
PDEs, engineering applications and Laplace transforms Section 12.11 
List of general formulas of Laplace transforms Section 6.8 

List of Laplace transforms and inverses Section 6.9 
Note: Your CAS can handle most Laplace transforms. 


Prerequisite: Chap. 2 
Sections that may be omitted in a shorter course: 6.5, 6.7 
References and Answers to Problems: App. 1 Part A, App. 2. 


6.| Laplace Transform. Linearity. 


First Shifting Theorem (s-Shifting) 


In this section, we learn about Laplace transforms and some of their properties. Because 
Laplace transforms are of basic importance to the engineer, the student should pay close 
attention to the material. Applications to ODEs follow in the next section. 

Roughly speaking, the Laplace transform, when applied to a function, changes that 
function into a new function by using a process that involves integration. Details are as 
follows. 

If f(¢) is a function defined for all t = 0, its Laplace transform’ is the integral of f(1) 
times e ** from t = 0 to ~. It is a function of s, say, F(s), and is denoted by £(f); thus 


(1) F(s) = £(f) = | e “f() dt. 


0 


Here we must assume that f(‘) is such that the integral exists (that is, has some finite 
value). This assumption is usually satisfied in applications—we shall discuss this near the 
end of the section. 


1 PIERRE SIMON MARQUIS DE LAPLACE (1749-1827), great French mathematician, was a professor in 
Paris. He developed the foundation of potential theory and made important contributions to celestial mechanics, 
astronomy in general, special functions, and probability theory. Napoléon Bonaparte was his student for a year. 
For Laplace’s interesting political involvements, see Ref. [GenRef2], listed in App. 1. 

The powerful practical Laplace transform techniques were developed over a century later by the English 
electrical engineer OLIVER HEAVISIDE (1850-1925) and were often called “Heaviside calculus.” 

We shall drop variables when this simplifies formulas without causing confusion. For instance, in (1) we 
wrote £( f) instead of £( f)(s) and in (1*) ¥-1(F) instead of LT (FVD). 
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EXAMPLE 1 


EXAMPLE 2 


Not only is the result F(s) called the Laplace transform, but the operation just described, 
which yields F(s) from a given f(4), is also called the Laplace transform. It is an “integral 
transform” 


F(s) = | k(s, )f() dt 


0 


with “kernel” k(s, ) = e**. 


Note that the Laplace transform is called an integral transform because it transforms 
(changes) a function in one space to a function in another space by a process of integration 
that involves a kernel. The kernel or kernel function is a function of the variables in the 
two spaces and defines the integral transform. 

Furthermore, the given function f(f) in (1) is called the inverse transform of F(s) and 
is denoted by LIF ); that is, we shall write 


(1*) fOjaS Le) 
Note that (1) and (1*) together imply mers a) = f and L(L-\(F)) = F. 


Notation 


Original functions depend on f¢ and their transforms on s—keep this in mind! Original 
functions are denoted by lowercase letters and their transforms by the same letters in capital, 
so that F(s) denotes the transform of f(t), and Y(s) denotes the transform of y(t), and so on. 


Laplace Transform 
Let f(t) = 1 when t 2 O. Find F(s). 


Solution. From (1) we obtain by integration 


* 1 
#65) = £0) = | ee dt= =e" =— (s > 0). 
0 


Such an integral is called an improper integral and, by definition, is evaluated according to the rule 


T 


| e “"£(t) dt = lim | ef (t) dt. 
0 


0 


Hence our convenient notation means 


x T 
1 1 1 1 
| edt = lim | . a lim | = es? 4 | (s > 0). 


Tx 
0 * 0 


We shall use this notation throughout this chapter. Bo 


Laplace Transform £(e%) of the Exponential Function e™ 
Let f() = e when t = 0, where a is a constant. Find £(f). 


Solution. Again by (1), 


oo 


Es 
= I age 
Le") = | e st pat dt = e (s-a)t 
a—s 
0 0 


hence, when s — a > 0, 


1 


s-=@ 


Le") = 
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Must we go on in this fashion and obtain the transform of one function after another 
directly from the definition? No! We can obtain new transforms from known ones by the 
use of the many general properties of the Laplace transform. Above all, the Laplace 
transform is a “linear operation,” just as are differentiation and integration. By this we 
mean the following. 


THEOREM 1 Linearity of the Laplace Transform 


The Laplace transform is a linear operation; that is, for any functions f(t) and g(t) 
whose transforms exist and any constants a and b the transform of af(t) + bg(t) 
exists, and 


Li af) + bg} = aLt{fO} + bL{ gO}. 


PROOF This is true because integration is a linear operation so that (1) gives 


L£{af(t) + bg(t)} | eC af(t) + bg(t)] dt 


0 


a | ef(0) dt + b | e“g(0) dt = aL(f()} + bL{gO}. 
0 0 


EXAMPLE 3. Application of Theorem 1: Hyperbolic Functions 


Find the transforms of cosh at and sinh at. 


Solution. Since cosh at = de + e~%) and sinh at = de —e~), we obtain from Example 2 and 
Theorem 1 


L£(cosh at) = 5 fle") + £(e~")) = 


L£(sinh at) = 5 fle") — £(e~")) 


EXAMPLE 4 _ Cosine and Sine 


Derive the formulas 


&£(sin wt) = 


L£(cos wt) = ; > 
So + @ So + @ 


Solution. We write L, = L(cos wt) and L, = £(sin wf). Integrating by parts and noting that the integral- 
free parts give no contribution from the upper limit ©, we obtain 


—st 


Pe e 
Le = { e*' cos wt dt = — 
0 


@ 7 —st .: 1 @ 
——]| e sin wt dt = —— —L,g, 
5], s Ss 


—st 


L.= st: = e : 
s= e” sin wt dt = sin wt 


—Ss 
0 


Oh @ 
+—] e cos wt dt = —Le. 
s Ss 
0 0 


SEC. 6.1 Laplace Transform. Linearity. First Shifting Theorem (s-Shifting) 207 


By substituting L, into the formula for L, on the right and then by substituting L, into the formula for L, on 
the right, we obtain 


lal 
icy 
| 
ale 
| 
alé& 
i 
“ale 
wn) 
fey 
SS 
tw 
is} 
ia 
NI] “ho 
ee 
a 
len) 
oO 
cy 
i) 
t | 
S 
iN) 


Lees! 
a 
| 
| 
oN 
i 
| 
| 
| wast 
a 
eae 
tN 
an 
a 
N| “bw 
ee 
i) 
| a 
a 
Nw 
& 
i) 
a 


Basic transforms are listed in Table 6.1. We shall see that from these almost all the others 
can be obtained by the use of the general properties of the Laplace transform. Formulas 
1-3 are special cases of formula 4, which is proved by induction. Indeed, it is true for 
n = 0 because of Example | and 0! = 1. We make the induction hypothesis that it holds 
for any integer n = 0 and then get it for n + 1 directly from (1). Indeed, integration by 
parts first gives 


oo 


2 os 
i = | e stpn +1 oy = =e stpn+1 
0 


Now the integral-free part is zero and the last part is (n + 1)/s times L(t”). From this 
and the induction hypothesis, 


aaa | ‘ n+1n! (n + 1)! 
gent) = £0") = Pee ee 
S S S Ss 


This proves formula 4. 


Table 6.1 Some Functions f(t) and Their Laplace Transforms £(f) 


FO L£(f) FO L£(f) 
1 1 1/ 7 t 
Ss cos w 
t+ ow 
2 t 1/s? 8 sin wt ae 
aa 
2 3 5 
3 t 21/8 9 cosh at 
sa 
i n! . a 
4 i= 0.1, i) aa 10 sinh at 2_ 2 
a Tia t+ 1) = 
5 é tae — ay 11 ett cos wt ——_-; 
(a positive) got (s-—a)° +a 
at 1 ato: @ 
6 e — 12 e~ sin wt 
Sa (=a)? +a 
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Ta + 1) in formula 5 is the so-called gamma function [(15) in Sec. 5.5 or (24) in 
App. A3.1]. We get formula 5 from (1), setting st = x: 


20 20 a oo 
L0% = | er dt -| (2) a = =| ex dx 


0 0 0 


where s >0. The last integral is precisely that defining ['(a + 1), so we have 
T(a + 1)/s**1, as claimed. (CAUTION! I'(a + 1) has x® in the integral, not x°* 1.) 
Note the formula 4 also follows from 5 because [(n + 1) = n! for integer n 2 0. 
Formulas 6-10 were proved in Examples 2-4. Formulas 11 and 12 will follow from 7 
and 8 by “shifting,” to which we turn next. 


s-Shifting: Replacing s by s — a in the Transform 


The Laplace transform has the very useful property that, if we know the transform of f(d), 
we can immediately get that of e“f(t), as follows. 


First Shifting Theorem, s-Shifting 
If f(0 has the transform F(s) (where s > k for some k), then e“ f(t) has the transform 


F(s — a) (where s — a> k). In formulas, 
LlefO} = Fs — a) 


or, if we take the inverse on both sides, 


e“F(t) = £1{F(s — a)}. 


We obtain F(s — a) by replacing s with s — a in the integral in (1), so that 


lo) 


F(s — a) = | cau | ee“ f(t)] dt = L{e“f(p}. 


0 0 


If F(s) exists (i.e., is finite) for s greater than some k, then our first integral exists for 
s — a > k. Now take the inverse on both sides of this formula to obtain the second formula 
in the theorem. (CAUTION! —a in F(s — a) but +a in e“F(t).) a 


s-Shifting: Damped Vibrations. Completing the Square 


From Example 4 and the first shifting theorem we immediately obtain formulas 11 and 12 in Table 6.1, 


Sg @ 


L{e“ cos wt} = L{e sin wt} = = 


= ay? + ow io= art+a 
For instance, use these formulas to find the inverse of the transform 


3s — 137 


Lf) ==. 
s+ 2s + 401 
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Solution. Applying the inverse transform, using its linearity (Prob. 24), and completing the square, we obtain 


f xf 3(s + 1) - + 20H st] \ 130-¥ 20 \ 
~ ls +124 4003” (s + 1)? + 202 (s + 12 + 202)" 


We now see that the inverse of the right side is the damped vibration (Fig. 114) 


f(t) = e~ "3 cos 20t — 7 sin 207). eal 


OL lols +t ea Aes 


a 
leet 


Fig. 114. Vibrations in Example 5 


Existence and Uniqueness of Laplace Transforms 


This is not a big practical problem because in most cases we can check the solution of 
an ODE without too much trouble. Nevertheless we should be aware of some basic facts. 

A function f(¢) has a Laplace transform if it does not grow too fast, say, if for all t = 0 
and some constants M and k it satisfies the “growth restriction” 


(2) [f(@)| = Me™. 


(The growth restriction (2) is sometimes called “growth of exponential order,” which may 
be misleading since it hides that the exponent must be kt, not kt” or similar.) 

f(@ need not be continuous, but it should not be too bad. The technical term (generally 
used in mathematics) is piecewise continuity. f(t) is piecewise continuous on a finite 
interval a S t S b where f is defined, if this interval can be divided into finitely many 
subintervals in each of which fis continuous and has finite limits as t approaches either 
endpoint of such a subinterval from the interior. This then gives finite jumps as in 
Fig. 115 as the only possible discontinuities, but this suffices in most applications, and 
so does the following theorem. 


ae 
ic b 


L 
a t 


Fig. 115. Example of a piecewise continuous function f(t). 
(The dots mark the function values at the jumps.) 
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Existence Theorem for Laplace Transforms 


If f(® is defined and piecewise continuous on every finite interval on the semi-axis 
t 2 0 and satisfies (2) for all t= 0 and some constants M and k, then the Laplace 
transform Lf) exists for all s > k. 


Since f(t) is piecewise continuous, e~**f(t) is integrable over any finite interval on the 
t-axis. From (2), assuming that s > k (to be needed for the existence of the last of the 
following integrals), we obtain the proof of the existence of L( f) from 


oo 


IL(f)| = | If@le—** at = | Me*te—**t dt = a 


0 0 


| e “F(t) dt 


0 


s—k 


Note that (2) can be readily checked. For instance, cosh t < et < nie (because t”/n! 
is a single term of the Maclaurin series), and so on. A function that does not satisfy (2) 
for any M and k is et (take logarithms to see it). We mention that the conditions in 
Theorem 3 are sufficient rather than necessary (see Prob. 22). 


Uniqueness. If the Laplace transform of a given function exists, it is uniquely 
determined. Conversely, it can be shown that if two functions (both defined on the positive 
real axis) have the same transform, these functions cannot differ over an interval of positive 
length, although they may differ at isolated points (see Ref. [A14] in App. 1). Hence we 
may say that the inverse of a given transform is essentially unique. In particular, if two 
continuous functions have the same transform, they are completely identical. 


PROBLEM SET 6-1 


1-16 


1. 3t + 12 


. Si 


erm w 


11. 2 


13. 


» COS Tt 
en 


¢ sinh ¢ 
n (wt + 0) 


PS 
a 


LAPLACE TRANSFORMS 


Find the transform. Show the details of your work. Assume 
that a, b, w, 0 are constants. 


15. 


16. 
A 


en eee 
2. (a — bt)? : : + 
4. cos” wt 17-24| SOME THEORY 
6. e~* sinh 4r 17. Table 6.1. Convert this table to a table for finding 
8. 1.5 sin 3t — 7/2) inverse transforms (with obvious changes, e.g., 
10. L1/s") = t"1/(n — 1), ete). 
ke— 18. Using £(f) in Prob. 10, find L( f,), where f4(t) = 0 
— if tS 2 and f,() = Lift > 2. 
19. Table 6.1. Derive formula 6 from formulas 9 and 10. 
12. 20. Nonexistence. Show that e’ does not satisfy a 
iL condition of the form (2). 
A | 21. Nonexistence. Give simple examples of functions 
1 2 (defined for all f20) that have no Laplace 
14. transform. 
it rn 22. Existence. Show that £(1/Vt) = V2r/s. [Use (30) 
| | T (5) = V7 in App. 3.1.] Conclude from this that the 
a b conditions in Theorem 3 are sufficient but not 


necessary for the existence of a Laplace transform. 
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23. Change of scale. If £0 f(t) = F(s) and c is any 
positive constant, show that L( f(ct)) = F(s/c)/c (Hint: 
Use (1).) Use this to obtain £(cos wt) from L(cos ft). 


24. Inverse transform. Prove that £~! is linear. Hint: 
Use the fact that & is linear. 


25-32 | INVERSE LAPLACE TRANSFORMS 


Given F(s) = £(f), find f(d. a, b, L, n are constants. Show 
the details of your work. 


2s + 1. +1 
25. Oost 1S = 26. ol 
g° + 3,24 Ss =25 
a 28. : 
Ls? + n?a? (s + V2)(s — V3) 
S++ 
7 30, SF 
Ss Ss s“ — 16 
s+ 10 1 
1. ———_ 2... 
: sy—s-2 (s + a)(s + b) 
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33-45 | APPLICATION OF s-SHIFTING 
In Probs. 33-36 find the transform. In Probs. 37-45 find 
the inverse transform. Show the details of your work. 


33. 17e~ 3 34. ke~™ cos wt 
35. 0.5e—* sin 27rt 36. sinh ¢t cos t 
7 6 
37. aE 38. 3 
(s + 7) (s + 1) 
39. —— 40. _ + 
(s + V2)4 S29 =3 
T 
41. 


s* + 10as + 24a? 


do ay (4D) 


sors i (s+1?%° (+18 
25-1 a(s +k) + ba 
43, = ra ai 
s° — 6s + 18 (s + k)° + 7 


ko(s a a) a ky 
45. ire Ta 
(s + a) 


6.2 Transforms of Derivatives and Integrals. 


ODEs 


The Laplace transform is a method of solving ODEs and initial value problems. The crucial 
idea is that operations of calculus on functions are replaced by operations of algebra 
on transforms. Roughly, differentiation of f(t) will correspond to multiplication of £(f) 
by s (see Theorems | and 2) and integration of f(t) to division of £(f) by s. To solve 
ODEs, we must first consider the Laplace transform of derivatives. You have encountered 
such an idea in your study of logarithms. Under the application of the natural logarithm, 
a product of numbers becomes a sum of their logarithms, a division of numbers becomes 
their difference of logarithms (see Appendix 3, formulas (2), (3)). To simplify calculations 
was one of the main reasons that logarithms were invented in pre-computer times. 


THEOREM -1 


Laplace Transform of Derivatives 


The transforms of the first and second derivatives of f(t) satisfy 


(1) LF y= sf f) — 70) 
(2) LF) = s LG) — 0) — Ff ©). 


Formula (1) holds if f(t) is continuous for all t 20 and satisfies the growth 
restriction (2) in Sec. 6.1 and f'(t) is piecewise continuous on every finite interval 
on the semi-axis t = 0. Similarly, (2) holds if f and f' are continuous for all t 2 0 
and satisfy the growth restriction and f" is piecewise continuous on every finite 
interval on the semi-axis t = 0. 
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We prove (1) first under the additional assumption that f’ is continuous. Then, by the 
definition and integration by parts, 


Lf’) = | ef dt =[e“FO]| +5 | eo S*¢(t) dt. 
0 


0 0 


Since f satisfies (2) in Sec. 6.1, the integrated part on the right is zero at the upper limit 
when s > k, and at the lower limit it contributes —f(0). The last integral is L( f). It exists 
for s > k because of Theorem 3 in Sec. 6.1. Hence Lf) exists when s > k and (1) holds. 
If f’ is merely piecewise continuous, the proof is similar. In this case the interval of 
integration of f’ must be broken up into parts such that f’ is continuous in each such part. 
The proof of (2) now follows by applying (1) to f” and then substituting (1), that is 


Lf") = s£(F') — f'O = ssL(f) -(O]=PLf/)-FO-fO. a 


Continuing by substitution as in the proof of (2) and using induction, we obtain the 
following extension of Theorem 1. 


Laplace Transform of the Derivative f"") of Any Order 


Let f, f',++* yaad be continuous for all t 2 0 and satisfy the growth restriction 
(2) in Sec. 6.1. Furthermore, let f be piecewise continuous on every finite interval 
on the semi-axis t = 0. Then the transform of f ne satisfies 


(3) LF™) = sPLCF) — 8? FO) — s™ "FO — 1 — FPO). 


Transform of a Resonance Term (Sec. 2.8) 


Let f(t) = tsin wt. Then f(0) = 0, f’(t) = sin wt + wt cos wt, f’(0) = 0,f" = 2w cos wt — wt sin wt. Hence 
by (2), 


Lf") = 2w . wL(f) = 2° L(f), thus L(f) = Lt sin wt) = eee al 


se t+ ar (s2 + @2)2 


Formulas 7 and 8 in Table 6.1, Sec. 6.1 


This is a third derivation of £(cos wt) and £(sin wt); cf. Example 4 in Sec. 6.1. Let f(t) = cos wt. Then 
f(0) = 1, f'(0) = 0, f" (0) w” cos wt. From this and (2) we obtain 


Lf") = s?£(f) -—s =e Lf). By algebra, £(cos wf) = a 
sw 


Similarly, let g = sin wt. Then g(0) = 0, g’ = w cos wt. From this and (1) we obtain 


£(g') = s£(g) = wL(cos wt). Hence, L(sin wt) = ° (cos wt) = 7 
S so + wy 


Laplace Transform of the Integral of a Function 


Differentiation and integration are inverse operations, and so are multiplication and division. 
Since differentiation of a function f(t) (roughly) corresponds to multiplication of its transform 
L£(f) by s, we expect integration of f(t) to correspond to division of £( f) by s: 
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Laplace Transform of Integral 


Let F(s) denote the transform of a function f(t) which is piecewise continuous for t = 0 
and satisfies a growth restriction (2), Sec. 6.1. Then, for s > 0, s > k, andt > 0, 


i if 
(4) { | f(r) ar} = *F), thus | f(t) dr = {ta}. 
0 0 


Denote the integral in (4) by g(t). Since f(‘) is piecewise continuous, g(t) is continuous, 
and (2), Sec. 6.1, gives 


t t 
= | lf(7)| | el dr =F (el pst (k > 0). 
0 0 


t 
| f(r) dt 


0 


le(O| = 


This shows that g(t) also satisfies a growth restriction. Also, g’(f) = f(t), except at points 
at which f(A) is discontinuous. Hence g'(t) is piecewise continuous on each finite interval 
and, by Theorem 1, since g(0) = O (the integral from 0 to 0 is zero) 


L{FO} = L{g'(O} = s£{gO} — gO) = s£L{g}. 


Division by s and interchange of the left and right sides gives the first formula in (4), 
from which the second follows by taking the inverse transform on both sides. ia 


Application of Theorem 3: Formulas 19 and 20 in the Table of Sec. 6.9 


1 1 
nd 


a : 
s(s2 + w”) s2(s2 + w”) 


Using Theorem 3, find the inverse of 


Solution. From Table 6.1 in Sec. 6.1 and the integration in (4) (second formula with the sides interchanged) 


we obtain 
: tf 2s 
1 sin wt 1 sin wt 1 
got = . got { dt 1 — cos wf). 
{3 + a} w Le + a} 7 w eo aH) 


This is formula 19 in Sec. 6.9. Integrating this result again and using (4) as before, we obtain formula 20 


in Sec. 6.9: 
1 i inwr |) ot si 
T sin wT sin wT 
oy \ | (1 — cos wrt) dt = ? 
s*(s? + w”) wo 0 wo ow 0 wo ow? 


It is typical that results such as these can be found in several ways. In this example, try partial fraction 
reduction. @ 


Differential Equations, Initial Value Problems 


Let us now discuss how the Laplace transform method solves ODEs and initial value 
problems. We consider an initial value problem 


(5) y" +ay' +by=r), yO)=Ko, y(0)=Ki 
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where a and b are constant. Here r(t) is the given input (driving force) applied to the 
mechanical or electrical system and y(f) is the output (response to the input) to be obtained. 
In Laplace’s method we do three steps: 


Step 1. Setting up the subsidiary equation. This is an algebraic equation for the transform 
Y = £(y) obtained by transforming (5) by means of (1) and (2), namely, 


[s?¥ — sy(0) — y'(0)] + a[sY — y(0)] + bY = R(s) 
where R(s) = L£(r). Collecting the Y-terms, we have the subsidiary equation 
(s? + as + b)Y = (s + a)y(O) + y'(0) + R(s). 


Step 2. Solution of the subsidiary equation by algebra. We divide by s2 + as + b and 
use the so-called transfer function 


1 1 
s?tastb Ge sae pe 


(6) O(s) = 


(Q is often denoted by H, but we need H much more frequently for other purposes.) This 
gives the solution 


(7) ¥(s) = [(s + a)y(0) + y'O)]O(s) + R(S)Q(). 
If y(0) = y'(0) = 0, this is simply Y = RQ; hence 


Ye L(output) 
R L(input) 


Q= 


and this explains the name of Q. Note that Q depends neither on r(t) nor on the initial 
conditions (but only on a and b). 


Step 3. Inversion of Y to obtain y = £~'(Y). We reduce (7) (usually by partial fractions 
as in calculus) to a sum of terms whose inverses can be found from the tables (e.g., in 
Sec. 6.1 or Sec. 6.9) or by a CAS, so that we obtain the solution y(t) = gly) of (5). 
Initial Value Problem: The Basic Laplace Steps 


Solve 
yi -y=t yO=rH1 yOH1. 


Solution. Step 1. From (2) and Table 6.1 we get the subsidiary equation [with Y = £(y)] 


s’Y — sy) — y'(0) — Y=1/s?, thus (s? — DY =s + 1 + 1/s”. 


Step 2. The transfer function is Q = 1/(s” — 1), and (7) becomes 


1 s+1 1 
Y=(s+ )O+—O 


s2 s2—1 s2(s2 =) 


Simplification of the first fraction and an expansion of the last fraction gives 
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Step 3. From this expression for Y and Table 6.1 we obtain the solution 


y(t) gl) ey} + aa ae | otal e’ + sinht — ¢. 


The diagram in Fig. 116 summarizes our approach. 


t-space s-space 


Given problem Subsidiary equation 
y"-y=t (s?-1)Y=s+1+ l/s? 
y(0) = 1 
y(0) =1 


Solution of given problem Solution of subsidiary equation 


y(é) =e' + sinh t-¢ area bates eh 
as eae 


Fig. 116. Steps of the Laplace transform method 


Comparison with the Usual Method 


Solve the initial value problem 
y" +y'+9y=0.  y(0) = 0.16, -y'(0) = 0. 


Solution. From (1) and (2) we see that the subsidiary equation is 


s°¥Y — 0.165 + sY—-0.16+9Y=0, thus (92? +94 9)¥ = 0.16(s + 1). 
The solution is 


0.16(9 + 1) 0.16(s + 4) + 0.08 


sr+s4+9 (sta? +2 


Hence by the first shifting theorem and the formulas for cos and sin in Table 6.1 we obtain 


35 0.08 35 
(0.16 cos t+ sin 7 
V4 1\/35 4 


e °°*(0.16 cos 2.96t + 0.027 sin 2.961). 


y(t) = £7") 


This agrees with Example 2, Case (III) in Sec. 2.4. The work was less. 
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Advantages of the Laplace Method 
1. Solving a nonhomogeneous ODE does not require first solving the 
homogeneous ODE. See Example 4. 
2. Initial values are automatically taken care of. See Examples 4 and 5. 


3. Complicated inputs r(t) (right sides of linear ODEs) can be handled very 
efficiently, as we show in the next sections. 
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Shifted Data Problems 


This means initial value problems with initial conditions given at some t = to > 0 instead of t = 0. For such a 
problem set t = 7 + fo, so that t = fg gives = 0 and the Laplace transform can be applied. For instance, solve 
yam) =2- V2. 


y’ +y=2t, yQGm)=57, 


Solution. We have to = 47 and we set t = 7 + it. Then the problem is 


P+¥=2+4am), FO=a7, FHO=2-Vv2 
where (1) = y(t). Using (2) and Table 6.1 and denoting the transform of ¥ by Y. , we see that the subsidiary 
equation of the “shifted” initial value problem is 


oe = eae ne ee: 
SY—s-5m7-(2-V2I+Y=44 , tls (7+ D¥Y=34 +—T +2- V2. 
Ss Ss Ss Ss 
Solving this algebraically for y , we obtain 
2 27 378 2- v2 


Y 
(s2 + 1)s% (9? + Ls 


sit] 


The inverse of the first two terms can be seen from Example 3 (with w = 1), and the last two terms give cos 
and sin, 


y LU) 2 — sin?) 4 da(1 cost) 4 da cost + (2 — V2) sint 
= + aq — V2 sint. 


as 1 
Now t =t— 4a, sint = Wa (sin t — cos f), so that the answer (the solution) is 
2 


y = 2t— sint + cost. a 


PROBLEEM—SET 6.2 


1-11| INITIAL VALUE PROBLEMS (IVPS) 


Solve the IVPs by the Laplace transform. If necessary, use 


12-15| SHIFTED DATA PROBLEMS 
Solve the shifted data [VPs by the Laplace transform. Show 


y” -4y=0, y0)=12, y'O)=0 


15. y” + 3y’ —4y= 6e22- 3, y(1.5) = 4, 


partial fraction expansion as in Example 4 of the text. Show the details. 

all details. 12. y" — 2y’ — 3y =0, y(4) = -3, 

1. yy’ + 5.2y = 19.4 sin 21, (0) =0 y'(4) = -17 

, _— — 
2. = 2y =0, (0) = 15 13. y' — 6y =0, y(-1) =4 
3.y —y —6y=0, y(0) = 11, 0) = 28 
2 aS yO) yO 14. y” + 2y’ + 5y = 50r— 100, (2) = -4, 

4, y° + 9y = 10e™",  y(0) = 0, y (0) =0 y'(2) = 14 

5 

6 


. y” — by’ + Sy = 29 cos 21, 

y'(0) = 6.2 
wy” + Ty’ + Ly = 21e%", y(0) = 3.5, 
y'(0) = —10 


y(0) = 3.2, 


I 


y'(1.5) = 5 


16-21) OBTAINING TRANSFORMS 


BY DIFFERENTIATION 


8. y" — 4y’ + 4y=0, y(0) = 8.1, y’(0) = 3.9 
9, y" — 4y’ + 3y = 6r—-8, y(0)=0, y')=0 
10. y" + 0.04y = 0.0217, y(0) = —25, y'() =0 


11. 


i, 


2.25y = 913 + 64, yO) = 1, 


y 3y 
y’(0) = 31 


5 


Using (1) or (2), find L( f) if f(H equals: 


16. tcos 4t 17. te“ 
18. cos? 2r 19. sin? wt 
20. sin* t. Use Prob. 19. 21. cosh? t 
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22. PROJECT. Further Results by Differentiation. 
Proceeding as in Example 1, obtain 
2 2 
—_— WwW 


(a) L(tcos wt) = 


30. PROJECT. Comments on Sec. 6.2. (a) Give reasons 
why Theorems 1 and 2 are more important than 
Theorem 3. 


(b) Extend Theorem 1 by showing that if f(t) is 


(s? + wy)? 
and from this and Example 1: (b) formula 21, (ce) 22, 
(d) 23 in Sec. 6.9, 


continuous, except for an ordinary discontinuity (finite 
jump) at somet = a (>0), the other conditions remaining 
as in Theorem 1, then (see Fig. 117) 


23-29 


2 2 
Soar a 
Panne” ahaa (1) £(f") = s£(f) — FO) = [fla + 0) = fla = O]e“®. 
2as (c) Verify (1*) for f() = e* if O<+¢< 1 and 0 if 
(f) L(t sinh at) = Gay t> 1. 
(d) Compare the Laplace transform of solving ODEs 
INVERSE TRANSFORMS with the method in Chap. 2. Give examples of your 


Using Theorem 3, find f(‘) if LF) equals: 


BY INTEGRATION 


own to illustrate the advantages of the present method 
(to the extent we have seen them so far). 


3 20 
ao re 24. >_> 
so + s/4 s” — 20s f@) 
i -0) 
l 1 Leone 
: s(s* + w) oe st — 5? ae fla + 0) 
. | 
: ce ~ eee. oe 
s” + 9s s +k*s 0 a i 
= Fig. 117. Formula (1*) 
s+ as” 


6.3 Unit Step Function (Heaviside Function). 
Second Shifting Theorem (t-Shifting) 


This section and the next one are extremely important because we shall now reach the 
point where the Laplace transform method shows its real power in applications and its 
superiority over the classical approach of Chap. 2. The reason is that we shall introduce 
two auxiliary functions, the unit step function or Heaviside function u(t — a) (below) and 
Dirac’s delta 5(t — a) (in Sec. 6.4). These functions are suitable for solving ODEs with 
complicated right sides of considerable engineering interest, such as single waves, inputs 
(driving forces) that are discontinuous or act for some time only, periodic inputs more 
general than just cosine and sine, or impulsive forces acting for an instant (hammerblows, 
for example). 


Unit Step Function (Heaviside Function) u(t — a) 


The unit step function or Heaviside function u(t — a) is 0 for t < a, has a jump of size 
1 at t = a (where we can leave it undefined), and is 1 for t > a, in a formula: 


0 ihe = (0) 
(1) w-a={ 


1 ift>a 


218 


CHAP. 6 Laplace Transforms 


u(t) u(t —a) 
—<$<$<—$—= IP _—=—$—= 
| 
0 t 0 a t 
Fig. 118. Unit step function u(t) Fig. 119. Unit step function u(t — a) 


Figure 118 shows the special case u(t), which has its jump at zero, and Fig. 119 the general 
case u(t — a) for an arbitrary positive a. (For Heaviside, see Sec. 6.1.) 
The transform of u(t — a) follows directly from the defining integral in Sec. 6.1, 


. a —st|* 
(ut — a) = | eur — aar = | et dt = —— : 
0 0 t=a 


here the integration begins at t = a(=0) because u(t — a) is 0 for t < a. Hence 


Cm 


S 


(2) L{ult — a} = (s > 0). 


The unit step function is a typical “engineering function” made to measure for engineering 
applications, which often involve functions (mechanical or electrical driving forces) that 
are either “off” or “on.” Multiplying functions f(f) with u(t — a), we can produce all sorts 
of effects. The simple basic idea is illustrated in Figs. 120 and 121. In Fig. 120 the given 
function is shown in (A). In (B) it is switched off between t = 0 and t = 2 (because 
u(t — 2) = 0 when t < 2) and is switched on beginning at t = 2. In (C) it is shifted to the 
right by 2 units, say, for instance, by 2 sec, so that it begins 2 sec later in the same fashion 
as before. More generally we have the following. 


Let f(t) = 0 for all negative t. Then f(t — a)u(t — a) with a> 0 is f(t) shifted 
(translated) to the right by the amount a. 


Figure 121 shows the effect of many unit step functions, three of them in (A) and 
infinitely many in (B) when continued periodically to the right; this is the effect of a 
rectifier that clips off the negative half-waves of a sinuosidal voltage. CAUTION! Make 
sure that you fully understand these figures, in particular the difference between parts (B) 
and (C) of Fig. 120. Figure 120(C) will be applied next. 


fi) 
5, 5, Si 
| 
(ia { 
| 
a na 2n t e 2n 2x ¢t 9 2 mt+2 2n+2 t 
-5- -5- -5- 


(A) f(t) =5 sint (B) f)ult — 2) (C) f(t — 2)u(t — 2) 


Fig. 120. Effects of the unit step function: (A) Given function. 
(B) Switching off and on. (C) Shift. 
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THEOREM 1 


PROOF 


a 4 

! J 

{4} 

es tae) ij A _f\ 
= — 0 2 4 6 8 10 


-k F 
(A) kiu(t — 1) — 2u(t — 4) + u(t - 6)] (B) 4 sin Gre)[u(t) —u(t— 2) +u(t-4)-+-] 


Fig. 121. Use of many unit step functions. 


Time Shifting (t-Shifting): Replacing t by t — a inf(t) 


The first shifting theorem (“‘s-shifting”) in Sec. 6.1 concerned transforms F(s) = L{f(1)} 
and F(s — a) = £{ e“ f(t) }. The second shifting theorem will concern functions f(f) and 
f(t — a). Unit step functions are just tools, and the theorem will be needed to apply them 
in connection with any other functions. 


Second Shifting Theorem; Time Shifting 
If f(t) has the transform F(s), then the “shifted function” 


0 ift<a 


(3) FO = ft — aut — a) = { 
ft-a ift>a 


has the transform e “*F(s). That is, if £{f()} = F(s), then 
(4) Lf(t — aut — a)} = e “F(s). 
Or, if we take the inverse on both sides, we can write 


(4*) f(t — aut — a) = L71'{e7~SF(s)}. 


Practically speaking, if we know F(s), we can obtain the transform of (3) by multiplying 
F(s) by e “*. In Fig. 120, the transform of 5 sin tis F(s) = 5/(s” + 1), hence the shifted 
function 5 sin (tf — 2)u(t — 2) shown in Fig. 120(C) has the transform 


e *SF(s) = 5e775/(s? + 1). 


We prove Theorem 1. In (4), on the right, we use the definition of the Laplace transform, 
writing 7 for t (to have t available later). Then, taking e “ inside the integral, we have 


e “F(s) = «| e f(t) dt = | eR Der) de, 
0 0 


Substituting 7 + a = ft, thus 7 = t — a, dt = dt in the integral (CAUTION, the lower 


limit changes!), we obtain 


e “Fs) = | e “*f(t — a) dt. 


a 
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EXAMPLE 1 


CHAP. 6 Laplace Transforms 


To make the right side into a Laplace transform, we must have an integral from 0 to ™, 
not from a to °%. But this is easy. We multiply the integrand by u(t — a). Then for ¢ from 
0 to a the integrand is 0, and we can write, with f as in (3), 


«o 


e F(t) dt. 


e SF(s) = | e “f(t — a)u(t — a) a= | 


0 0 
(Do you now see why u(t — a) appears?) This integral is the left side of (4), the Laplace 
transform of f(f) in (3). This completes the proof. | 


Application of Theorem 1. Use of Unit Step Functions 


Write the following function using unit step functions and find its transform. 


2 if0<1t<1 
fO=Sat ifl<t<da (Fig. 122) 
cost if t> nr. 


Solution. Step I. 1m terms of unit step functions, 


f(t) = 20. — u(t — 1) + $2(ut — 1) — w(t — §a0)) + (cos u(t — 477). 


Indeed, 2(1 — u(t — 1)) gives f(‘) for 0 < t < 1, and so on. 


Step 2. To apply Theorem |, we must write each term in f(t) in the form f(t — a)u(t — a). Thus, 2(1 — u(t — 1)) 
remains as it is and gives the transform 2(1 — e *)/s. Then 


{+ Pu v} #2 y2+¢-) tS 


als) ea 


se{ (cos f)u (: 
Together, 


2 2 Ie OT «ll 1 «aim 1 
g£ -s'4 tH = as]? 
cc ss (33 s* =) (3 2s” we s+] 


If the conversion of f(t) to f(t — a) is inconvenient, replace it by 


(4) L{f(Quet — a)} = e CL FE + a)}. 


(4**) follows from (4) by writing f(t — a) = g(t), hence f(f) = g(t + a) and then again writing f for g. Thus, 


1 1 1 1 Io. 1 
{5 Pu vf ere{ ia ve} ele? +t | o(s tat =) 


as before. Similarly for Li 17 ul = a7)}. Finally, by (4**), 


1 it ) 
se{cos (1 = +7) = {cos (: + +7) = e 75/28 (sin t} e 7/2, ; 
2 2 sot] 
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F(t) 
2 


Fig. 122. f(t) in Example 1 


EXAMPLE 2. Application of Both Shifting Theorems. Inverse Transform 


Find the inverse transform f(¢) of 


e 
F(s) 


Solution. Without the exponential functions in the numerator the three terms of F(s) would have the inverses 
(sin 771)/77, (sin 77f)/77, and te~** because 1/s” has the inverse 1, so that 1/(s + 2) has the inverse te~*’ by the 
first shifting theorem in Sec. 6.1. Hence by the second shifting theorem (t-shifting), 


, 1. 1. -2(t-3) 
Sf 7 sin (w(t — 1)) u(t — 1) 4 7 sin (a(t — 2)) u(t — 2) + (t — 3)e u(t — 3). 


Now sin (wt — 7) = —sin 7t and sin (77t — 277) = sin 7t, so that the first and second terms cancel each other 
when ¢> 2. Hence we obtain f() =0 if O<t<1,—-(sinad/7 if 1<t<2, 0 if 2<r<3, and 
(t — 3)e 2*-® if t > 3. See Fig. 123. is] 

03°F 

02 - 

O.1F- 

(e) | | 
0 1 2 3 4 5 6 t 


Fig. 123. f(t) in Example 2 


EXAMPLE 3_ Response of an RC-Circuit to a Single Rectangular Wave 


Find the current i(f) in the RC-circuit in Fig. 124 if a single rectangular wave with voltage Vo is applied. The 
circuit is assumed to be quiescent before the wave is applied. 


Solution. The input is Vo[u(t — a) — u(t — b)]. Hence the circuit is modeled by the integro-differential 
equation (see Sec. 2.9 and Fig. 124) 


q(t) it 

Ri(t) 4 Ri(t) 4 | i(t) dr = v(t) = Vout — a) — ult — b)). 
Cc ap 
v(t) i(t) 


v(t) 


Fig. 124. RC-circuit, electromotive force v(t), and current in Example 3 
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EXAMPLE 4 


CHAP. 6 Laplace Transforms 


Using Theorem 3 in Sec. 6.2 and formula (1) in this section, we obtain the subsidiary equation 


I(s VY 
RI(s) + se [e7% — eS). 
sC Ss 


Solving this equation algebraically for J(s), we get 


VoIR Yo 
Ks) = F(s(e~® — e7 8 where F(s) = —————_ and PUK) =— e VRBO, 
(s) (s)( ) ) s+ 1/RO (F) : 


the last expression being obtained from Table 6.1 in Sec. 6.1. Hence Theorem | yields the solution (Fig. 124) 


Vo 
i(t) = LM" = $~ {eS F(s)} — $7 e7F(s)} = = [et A/ ROG = g) — eH t-BYROy — b)]; 


that is, i(f) = Oif t < a, and 


Kye (RO ifa<t<b 
it) = 
(Ky — Kye RO ifa>b 
where K; = Yoe/ZOrR and Ky = Yoe/FO/R. | 


Response of an RLC-Circuit to a Sinusoidal Input Acting Over a Time Interval 


Find the response (the current) of the RLC-circuit in Fig. 125, where E(f) is sinusoidal, acting for a short time 
interval only, say, 


E(t) = 100sin400t if0<t<2m7 and E(t) =O0ift>27 


and current and charge are initially zero. 


Solution. The electromotive force E(t) can be represented by (100 sin 400¢)(1 — u(t — 27r)). Hence the 
model for the current i(f) in the circuit is the integro-differential equation (see Sec. 2.9) 


t 
0.1 + Lit 100 | i(r) dr = (100 sin 4002)(1 — u(t — 277)). i(0) =0, i’(0) =0. 
0 


From Theorems 2 and 3 in Sec. 6.2 we obtain the subsidiary equation for [(s) = L(i) 


nee ere ee (2 —_ 
re} s  s*+4007\s s : 


Solving it algebraically and noting that s? + 110s + 1000 = (s + 10)(s + 100), we obtain 


1000 - 400 s se 27s 
l(s) ( ; 


(s + 10)(s + 100) \s2 + 4007-5 + 4002 


For the first term in the parentheses (---) times the factor in front of them we use the partial fraction 
expansion 


400,000s A B _ Dst+kK 


(s + 10)(s + 100)(s2 + 4002) s +10 ° s+ 100° s% + 400?" 


Now determine A, B, D, K by your favorite method or by a CAS or as follows. Multiplication by the common 
denominator gives 


400,000s = A(s + 100)(s* + 4007) + B(s + 10)(s2 + 400%) + (Ds + K)(s + 10)(s + 100). 
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We set s = —10 and —100 and then equate the sums of the s° and s terms to zero, obtaining (all values rounded) 
(s = —10) —4,000,000 = 90(107 + 4007)A, A = —0.27760 
(s = —100) — —40,000,000 = —90(1007 + 4007)B, B= 2.6144 
(s?-terms) 0O=A+B+D, D = —2.3368 
(s?-terms) 0=100A+10B+110D+K, K= 258.66. 


Since K = 258.66 = 0.6467 - 400, we thus obtain for the first term J, in J = J, — Ip 


0.2776 2.6144 2.33685, 0.6467 - 400 
s+10°s+100 s2+4002° 52 +400? ° 


1 


From Table 6.1 in Sec. 6.1 we see that its inverse is 

i,(f) = —0.2776e7 1 + 2.6144e~ 20 — 2.3368 cos 400t + 0.6467 sin 400F. 
This is the current i(t) when 0 < t < 277. It agrees for 0 < tf < 277 with that in Example | of Sec. 2.9 (except 
for notation), which concerned the same RLC-circuit. Its graph in Fig. 63 in Sec. 2.9 shows that the exponential 
terms decrease very rapidly. Note that the present amount of work was substantially less. 

The second term /; of / differs from the first term by the factor e278 Since cos 400(t — 277) = cos 400r 
and sin 400(t — 27r) = sin 400f, the second shifting theorem (Theorem 1) gives the inverse jg(t) = 0 if 
0 <t< 27, and for > 277 it gives 

in(t) = —0.2776e7 1*-2™ 4. 2.6144e7 1008-2 _ 9 3368 cos 400¢ + 0.6467 sin 4007. 
Hence in i(t) the cosine and sine terms cancel, and the current for t > 277 is 


i(t) = —0.2776(e72% — e7 10-2) + 9.6 144(e7100¢ — 9 1000-27) 


It goes to zero very rapidly, practically within 0.5 sec. i) 


C=107F 


R=11Q L=0.1H 


E(t) 
Fig. 125. RLC-circuit in Example 4 


PROBLEM SET 6.3 


1. Report on Shifting Theorems. Explain and compare 
the different roles of the two shifting theorems, using your 
own formulations and simple examples. Give no proofs. 


2-11| SECOND SHIFTING THEOREM, 
UNIT STEP FUNCTION 
Sketch or graph the given function, which is assumed to be 


zero outside the given interval. Represent it, using unit step 
functions. Find its transform. Show the details of your work. 


270 <4 =D) 3. t-2(t> 2) 
4. cos 4t (0 < t < 7) 5. e (0<1t< 7/2) 


6. sin Wt (2 <t <4) Te ™(2<t<A4) 
8. 2(1 <t<2) 9. 12 (¢ > 3) 
10. sinh t (0 < t < 2) 11. sint (7/2 <t< 7) 


12-17| INVERSE TRANSFORMS BY THE 
2ND SHIFTING THEOREM 
Find and sketch or graph f(t) if £(f) equals 
12. e~35/(s — 1)3 13. 6(1 — e~7)/(s? + 9) 
14, 4(e75 — 2e75)/s 15. e**/s# 
16. 2(e7* — e735)/(s? — 4) 
17. (1 +e ?7S* PVs + I/(s + 1)? + 1) 


224 CHAP. 6 Laplace Transforms 


18-27 


IVPs, SOME WITH DISCONTINUOUS 
INPUT 

Using the Laplace transform and showing the details, solve 
18. 9y” — 6y’ +y=0, yO) = 3, y'0) = 1 


31. Discharge in RC-circuit. Using the Laplace transform, 
find the charge q(t) on the capacitor of capacitance C 
in Fig. 127 if the capacitor is charged so that its potential 
is Vo and the switch is closed at t = 0. 


19. y” + 6y' + By =e *—e™™, yO) = 0, y'(0) = 0 
20. y” + 10y’ + 24y = 14427, (0) = 19/12, 
y'(0) = —-5 
21. y” + Sy = 8sintif0O<t< 7 andOift> 7; 
y(0) = 0, y'(0) = 4 
22. y" + 3y’ + 2y = 4tifO<t< land 8ifr>1; 
y(0) = 0, y'(0) = 0 
23. y” + y' — 2y = 3sint — costif 0 <t< 27 and 
3 sin 2t — cos 2rift > 2m; y(0) = 1, y'(0) =0 
24, y"” + 3y’ + 2y = 1ifO<t< 1 andOift>1; 
y(0) = 0, y'(0) = 0 
25. y" +y=rifO<t<landOifr>1; 
y'(0) =0 
26. Shifted data. y"” + 2y’ + 5y = 10sintifO <t< 27 
and Oif t > 2a; y(a) = 1, y'(7) = 2e7-7 — 2 
27. Shifted data. y"” + 4y = 87? if 0<1<5 and 0 if 
t>5; yd) =1+4cos2, y'(1) = 4 — 2sin2 


y(0) = 0, 


28-40 | MODELS OF ELECTRIC CIRCUITS 


28-30 | RL-CIRCUIT 


Using the Laplace transform and showing the details, find 

the current i(f) in the circuit in Fig. 126, assuming i(0) = 0 

and: 

28. R= 1kQO (=1000 0), L = 1H,v =0if0O<t<7, 
and 40 sint Vif t > 7 

29. R =250,L=0.1H,v =490e*V if 0O<t<1 
and Oift> 1 

30. R= 100,L =0.5H,v = 200r V if 0 <t< 2 and0 
ift>2 


u(t) 


Fig. 126. Problems 28-30 


Cc 


LL 


Fig. 127. Problem 31 


32-34| RC-CIRCUIT 


Using the Laplace transform and showing the details, find 
the current i(f) in the circuit in Fig. 128 with R = 10 Q and 
C= 10°? F, where the current at tf = 0 is assumed to be 
zero, and: 


32. v = Oift <4 and 14-108 * Vifr>4 
33. v = Oift< 2 and 100(¢ — 2) Vifr > 2 


34. v(t) = 100 V if 0.5 < t < 0.6 and 0 otherwise. Why 
does i(f) have jumps? 


v(t) 


Fig. 128. Problems 32-34 


35-37 | LC-CIRCUIT 


Using the Laplace transform and showing the details, find 
the current i(f) in the circuit in Fig. 129, assuming zero 
initial current and charge on the capacitor and: 


35. L= 1H, C = 10°2F, v = —9900 cos t V if 
7 <t< 37 and O otherwise 

36. L = 1H, C = 0.25F, v = 200(t — 41°) Vif 
O<rt<tlandOifr>1 


37. L=0.5H, C= 0.05F, v = 78sintV if0<t<7 
and 0 if t > 7 


v(t) 


Fig. 129. Problems 35-37 


RLC-CIRCUIT 


Using the Laplace transform and showing the details, find 
the current i(f) in the circuit in Fig. 130, assuming zero 
initial current and charge and: 


38. R=40,L=1H,C=0.05F, vu = 34e ‘FV if 
O0<t<4and0ifr>4 
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39. R=20,L=1H,C=0.5F, vi) = 1kV if 40. R=20,L=1H,C=0.1F,v = 255 sintV 
0O<t<2and0ifr>2 if0 <t< 27 and 0 if t > 277 
Cc 
a : 
u(t) 
Fig. 130. Problems 38-40 Fig. 131. Current in Problem 40 


6.4 Short Impulses. Dirac’s Delta Function. 
Partial Fractions 


An airplane making a “hard” landing, a mechanical system being hit by a hammerblow, 
a ship being hit by a single high wave, a tennis ball being hit by a racket, and many other 
similar examples appear in everyday life. They are phenomena of an impulsive nature 
where actions of forces—mechanical, electrical, etc-—are applied over short intervals 
of time. 

We can model such phenomena and problems by “Dirac’s delta function,” and solve 
them very effecively by the Laplace transform. 

To model situations of that type, we consider the function 


i/k ifaStSatk 
(1) f(t — a) = { (Fig. 132) 


0 otherwise 


(and later its limit as k 0). This function represents, for instance, a force of magnitude 
1/k acting from t = a to t= a +k, where k is positive and small. In mechanics, the 
integral of a force acting over a time interval a= t=a+k is called the impulse of 
the force; similarly for electromotive forces E(f) acting on circuits. Since the blue rectangle 
in Fig. 132 has area 1, the impulse of fj, in (1) is 


atk 


(2) n= | fylt aai= | wt = 1, 


0 a 


aat+k t 


Fig. 132. The function f,(t — a) in (1) 
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To find out what will happen if k becomes smaller and smaller, we take the limit of f,, 
as k—>0 (k > 0). This limit is denoted by 6(t — a), that is, 


6(t — a) = lim f(t — a). 


6(t — a) is called the Dirac delta function? or the unit impulse function. 

6(f — a) is not a function in the ordinary sense as used in calculus, but a so-called 
generalized function.” To see this, we note that the impulse Jj, of f, is 1, so that from (1) 
and (2) by taking the limit as k > 0 we obtain 


o ift=a - 
(3) O(t — a) = and | O(t — a) dt = 1, 
0 otherwise 0 


but from calculus we know that a function which is everywhere 0 except at a single point 
must have the integral equal to 0. Nevertheless, in impulse problems, it is convenient to 
operate on 6(¢ — a) as though it were an ordinary function. In particular, for a continuous 
function g(t) one uses the property [often called the sifting property of 5(¢ — a), not to 
be confused with shifting] 


(4) | g(t)d(t — a) dt = g(a) 
0 


which is plausible by (2). 
To obtain the Laplace transform of 6(t — a), we write 


1 
Sx(t — a) = k [u(t — a) — u(t — (a + k))] 
and take the transform [see (2)] 
= = a. -as _ ,—(a+k)s} _ ,—as 
Life- a} = 7 [e* =e ]=e 


We now take the limit as k ~ 0. By I’H6pital’s rule the quotient on the right has the limit 
1 (differentiate the numerator and the denominator separately with respect to k, obtaining 
se~*S and s, respectively, and use se—*S/ s—1as k—0O). Hence the right side has the 
limit e~“. This suggests defining the transform of 6(t — a) by this limit, that is, 


(5) L{8(t-—ap=e™. 


The unit step and unit impulse functions can now be used on the right side of ODEs 
modeling mechanical or electrical systems, as we illustrate next. 


2PAUL DIRAC (1902-1984), English physicist, was awarded the Nobel Prize [jointly with the Austrian 
ERWIN SCHRODINGER (1887-1961)] in 1933 for his work in quantum mechanics. 

Generalized functions are also called distributions. Their theory was created in 1936 by the Russian 
mathematician SERGEI L’ VOVICH SOBOLEV (1908-1989), and in 1945, under wider aspects, by the French 
mathematician LAURENT SCHWARTZ (1915-2002). 
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EXAMPLE 1 Mass—Spring System Under a Square Wave 


Determine the response of the damped mass-spring system (see Sec. 2.8) under a square wave, modeled by 
(see Fig. 133) 


y" + 3y +2y =r =ut—1)-ut-2, y0)=0, yO) =0. 


Solution. From (1) and (2) in Sec. 6.2 and (2) and (4) in this section we obtain the subsidiary equation 


1 1 
2 —s —2s . —s —2s 
s°Y + 3s¥Y + 2Y e e ; Solution Ys e ; 
Ss te ) (s) s(s2 + 36h 2) ( ) 
Using the notation F(s) and partial fractions, we obtain 
1 1 
1 1 2 1 2 


F(s) 3 + 
sis +35+2) s(s+1)(s+2) gs stl gs+2 


From Table 6.1 in Sec. 6.1, we see that the inverse is 


fO= LF =5-e* + he, 


Therefore, by Theorem | in Sec. 6.3 (shifting) we obtain the square-wave response shown in Fig. 133, 
y = £7 F(s)e™* — F(sye~*5) 
f(t — uct — 1) — f(t — 2)uct — 2) 


0 (OO<t< 1) 
- 1 — eT t-D 4 1,-2¢-D (l<t<2) 
eo t-D n eo t- 2 | de -2-D _ de 20-2) (> 2). | | 
y(t) 
lr ———S 
! | 
| | 
| | 
OS = | 
| 
| 
! | 
() l ! | 
0 1 2 3 4 t 


Fig. 133. Square wave and response in Example 1 


EXAMPLE 2  Hammerblow Response of a Mass—Spring System 


Find the response of the system in Example | with the square wave replaced by a unit impulse at time t = 1. 
Solution. We now have the ODE and the subsidiary equation 
y" + 3y’ + 2y = &(t — 1), and (s2 + 35+ 2¥=e°8. 


Solving algebraically gives 


By Theorem | the inverse is 


0 ifO<r<1 


y(t) = LY) = 
e7t-D _ 2-2t-D ig rote 
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y(t) is shown in Fig. 134. Can you imagine how Fig. 133 approaches Fig. 134 as the wave becomes shorter and 
shorter, the area of the rectangle remaining 1? a] 


y(t) 
0.2- 


0.1- 


| I 
% 1 3 5 t 


Fig. 134. Response to a hammerblow in Example 2 


EXAMPLE 3. Four-Terminal RLC-Network 


Find the output voltage response in Fig. 135 if R = 200, L = 1H, C = 10~*F, the input is 8(¢) (a unit impulse 
at time f = 0), and current and charge are zero at time t = 0. 


Solution. To understand what is going on, note that the network is an RLC-circuit to which two wires at A 
and B are attached for recording the voltage u(t) on the capacitor. Recalling from Sec. 2.9 that current i(f) and 
charge q(t) are related by i = q’ = dq/dt, we obtain the model 


Li’ + Rit : Lq" + Rq' 4 q” + 20g’ + 10,000g = &(t). 


From (1) and (2) in Sec. 6.2 and (5) in this section we obtain the subsidiary equation for Q(s) = L(q) 


1 


(s? + 20s + 10,000)0 = 1. ~— Solution ==9Q = —————_—_.. 
(s + 10)? + 9900 


By the first shifting theorem in Sec. 6.1 we obtain from Q damped oscillations for g and v; rounding 9900 ~ 99.50”, 
we get (Fig. 135) 


1 q 
q= £1) = —— ¢7 } gin 99.50t and v = — = 100.5e7 sin 99.50t. HB 
99.50 C 
6(t) v 
R L 
é 
A B 
v(t) =? 
Network Voltage on the capacitor 


Fig. 135. Network and output voltage in Example 3 


More on Partial Fractions 


We have seen that the solution Y of a subsidiary equation usually appears as a quotient 
of polynomials Y(s) = F(s)/G(s), so that a partial fraction representation leads to a sum 
of expressions whose inverses we can obtain from a table, aided by the first shifting 
theorem (Sec. 6.1). These representations are sometimes called Heaviside expansions. 
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EXAMPLE 4 


An unrepeated factor s — a in G(s) requires a single partial fraction A/(s — a). 
See Examples | and 2. Repeated real factors (s — a)”, (s — a)’, ete., require partial 
fractions 


Az Ay Az Az Ay 


~ , + + : 
(s-—a)* s-a (s-—a> (s—a*® s-a 


etc., 


The inverses are (Agt + Ay)e“, (GAgt? + Aot + Aye™, ete. 

Unrepeated complex factors (s — a)(s — a),a = a + iB, a = a — if, require a partial 
fraction (As + B)/[(s — a)? + B?]. For an application, see Example 4 in Sec. 6.3. 
A further one is the following. 


Unrepeated Complex Factors. Damped Forced Vibrations 
Solve the initial value problem for a damped mass-spring system acted upon by a sinusoidal force for some 
time interval (Fig. 136), 

y" + 2y' + 2y=r(t), r(t) = 10sin2rifO<t<mwand0ift>a7; y0)=1, y'()=—5. 


Solution. From Table 6.1, (1), (2) in Sec. 6.2, and the second shifting theorem in Sec. 6.3, we obtain the 
subsidiary equation 


2 
(s2¥ — » + 5) + 2sY — 1) + 2¥ = 10 (1 — e779). 
set4 


We collect the Y-terms, (s? + 2s + 2)Y, take —s + 5 — 2 = —s + 3 to the right, and solve, 


20 20e~7* s—3 
(6) ¥ { 


(s2 + 4y(s2 +25 +2) (92 + 4)(s2 +2542) 5? £2542. 


For the last fraction we get from Table 6.1 and the first shifting theorem 


1-4 
(7) eat =e (cost —4sina). 
(s+ 12+1 


In the first fraction in (6) we have unrepeated complex roots, hence a partial fraction representation 


20 As +B Ms + N 
(s2 + 4s? +2542) st?+4  s2+ 2942 


Multiplication by the common denominator gives 


20 = (As + B)(s® + 2s + 2) + (Ms + NV(s? + 4). 
We determine A, B, M, N. Equating the coefficients of each power of s on both sides gives the four equations 
(a) [s*]: 0=A+M (b) [s2]: O0=2A+B+N 
(c) [s]: 0=2A+2B+4M  (d) [s°]: 20 =2B+4N. 


We can solve this, for instance, obtaining M = —A from (a), then A = B from (c), then N = —3A from (b), 
and finally A = —2 from (d). Hence A 2, B 2, M = 2, N = 6, and the first fraction in (6) has the 
representation 


—-2s-2 2%stl)+6-2 = 
(8) + . Inverse transform: —2 cos 2t — sin2t+e “(2cost + 4sinf). 
si+4 (s+)? +1 
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The sum of this inverse and (7) is the solution of the problem for 0 < tf < 7, namely (the sines cancel), 


(9) 


In the second fraction in (6), taken with the minus sign, we have the factor e~ 


y(t) = 3e~* cos t — 2 cos 2t — sin 2r 


if0 << 7. 


7S so that from (8) and the second 


shifting theorem (Sec. 6.3) we get the inverse transform of this fraction for tf > 0 in the form 


+2 cos (2t 


27r) + sin (2t 


277) 


e~*™ [2 cos (t — @) + 4sin(t — 7] 


= 2cos 2t + sin2t + e *"™ (2cost + 4sin A. 


The sum of this and (9) is the solution for t > 77, 


(10) 


y(t) = e~[(3 + 2e”) cos t + 4e” sin ft] 


ift> 7. 


Figure 136 shows (9) (for 0 < t < 7) and (10) (for t > 7), a beginning vibration, which goes to zero rapidly 


because of the damping and the absence of a driving force after t = 77. 


A 


Driving force | 7 


Mechanical system 


y = 0 (Equilibrium 


position) 


Dashpot (damping) 


Fig. 136. 


y(t) 


Output (solution) 


Example 4 


The case of repeated complex factors [(s — a)(s — ar: which is important in connection 
with resonance, will be handled by “convolution” in the next section. 


CAS PROJECT. Effect of Damping. Consider a 
vibrating system of your choice modeled by 


y" + cy’ + ky = 60). 


(a) Using graphs of the solution, describe the effect of 
continuously decreasing the damping to 0, keeping k 
constant. 


(b) What happens if c is kept constant and k is 
continuously increased, starting from 0? 

(c) Extend your results to a system with two 
6-functions on the right, acting at different times. 


. CAS EXPERIMENT. Limit of a Rectangular Wave. 


Effects of Impulse. 


(a) In Example | in the text, take a rectangular wave 
of area 1 from | to 1 + k. Graph the responses for a 
sequence of values of k approaching zero, illustrating 
that for smaller and smaller k those curves approach 


PROBLEM SET 6.4 


the curve shown in Fig. 134. Hint: If your CAS gives 
no solution for the differential equation, involving k, 
take specific k’s from the beginning. 


(b) Experiment on the response of the ODE in Example 
1 (or of another ODE of your choice) to an impulse 
6(t — a) for various systematically chosen a (> 0); 
choose initial conditions y(0) # 0, y’(0) = 0. Also con- 
sider the solution if no impulse is applied. Is there a 
dependence of the response on a? On b if you choose 
b8&(t — a)? Would —6(t — a) witha > a annihilate the 
effect of 6(t — a)? Can you think of other questions that 
one could consider experimentally by inspecting graphs? 


3-12 


EFFECT OF DELTA (IMPULSE) 


ON VIBRATING SYSTEMS 


Find and graph or sketch the solution of the IVP. Show the 
details. 


3. 


y" + 4y = 6-77), y(0) = 8, y'(0) =0 
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yu 


a 


- 


12. 


13. 


14. 


y" + loy = 46(t — 377), y(0) = 2, y'(0) = 0 
y" +y = &(t — 1) — &(t — 2m), 
y(0) = 0, y'(0) = 1 
y" + 4y’ + 5y = &— 1), (0) = 0, y') = 3 
4y” + 24y’ + 37y = 17e * + S(t — 4), 
y0) = 1, y'@=1 
y" + 3y’ + 2y = 10(sint + &(t — 1), yO) = 1, 
y'(0) = -1 
"+ dy’ + 5y =[1 — u(t — 10)]e* — e!8(t — 10), 
y(0) = 0, y’(0) = 
y" + 5y’ + 6y = S(t dar) + u(t — 7) cos ft, 
y(0) = 0, y’(0) = 0 
y" + 5y’ + 6y = u(t — 1) + &(t — 2), 
y(0) = 0, y’(0) = 
y" + 2y' + Sy = 25t — 1008(t — ar), (0) = —2, 
y'(0) =5 
PROJECT. Heaviside Formulas. (a) Show that for 


a simple root a and fraction A/(s — a) in F(s)/G(s) we 
have the Heaviside formula 

(s — a)F(s) 

sa G(s) 


(b) Similarly, show that for a root a of order m and 
fractions in 


F(s) 
G(s) 


Am 


(s — a)” 


Am-1 


(s —a)™1 


1 . 
aI a + further fractions 


we have the Heaviside formulas for the first coefficient 


(s — a) F(s) 
Am = lim 
sa G(s) 


and for the other coefficients 


A, = 


d@—* f(s — a)" F(s) 
ae} 
k=1,-:-,m-—1. 


lim 
(m — k)! soa ds™—* 


TEAM PROJECT. Laplace Transform of Periodic 
Functions 


(a) Theorem. The Laplace transform of a piecewise 
continuous function f(t) with period p is 


p 
AG) = —=| e f(t) dt (s > 0). 
i =2 


0 


(11) 


Prove this theorem. Hint: Write [y= Sp of, a pores, 


Set t = (n — 1)p in the nth integral. Take out e~~ P? 


from under the integral sign. Use the sum formula for 
the geometric series. 

(b) Half-wave rectifier. Using (11), show that the 
half-wave rectification of sin wt in Fig. 137 has the 
Laplace transform 


w(1 + e77/”) 


Lf) = 


(s2 + w2)\(1 — e7278/) 
@ 


(s7 + w)\(1 — e777”) 
(A half-wave rectifier clips the negative portions of the 
curve. A full-wave rectifier converts them to positive; 
see Fig. 138.) 


(c) Full-wave rectifier. Show that the Laplace trans- 
form of the full-wave rectification of sin wt is 


i Ss 
coth —. 
s? + @ 2w 


0 tla 2r/lo 3/0 t 


Fig. 137. Half-wave rectification 


32/0 t 


Full-wave rectification 


0 tio 2n/lo 


Fig. 138. 


(d) Saw-tooth wave. Find the Laplace transform of the 
saw-tooth wave in Fig. 139. 


WA 


P 
Fig. 139. 


Saw-tooth wave 


15. Staircase function. Find the Laplace transform of the 
staircase function in Fig. 140 by noting that it is the 
difference of kt/p and the function in 14(d). 


f(t) 


Fig. 140. Staircase function 
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6.5 Convolution. Integral Equations 


THEOREM -1 


EXAMPLE 1 


EXAMPLE 2 


Convolution has to do with the multiplication of transforms. The situation is as follows. 
Addition of transforms provides no problem; we know that £(f + g) = L(f) + L(g). 
Now multiplication of transforms occurs frequently in connection with ODEs, integral 
equations, and elsewhere. Then we usually know £( f) and L(g) and would like to know 
the function whose transform is the product LC f)L(g). We might perhaps guess that it 
is fg, but this is false. The transform of a product is generally different from the product 
of the transforms of the factors, 


L£( fg) # L(f)L(g) in general. 


To see this take f = e’ and g = 1. Thenfg = e’, £( fg) = 1/(s — 1), but £(f) = 1/(s — 1) 
and £(1) = I/s give L(f)L(g) = 1/(s” — s). 

According to the next theorem, the correct answer is that £( f)L(g) is the transform of 
the convolution of fand g, denoted by the standard notation f * g and defined by the integral 


t 


(1) AQ) = (Ff #9) = ly (r)g(t — 7) dr. 


0 


Convolution Theorem 


If two functions f and g satisfy the assumption in the existence theorem in Sec. 6.1, 
so that their transforms F and G exist, the product H = FG is the transform of h 
given by (1). (Proof after Example 2.) 


Convolution 
Let H(s) = 1/[(s — a)s]. Find h(a). 
Solution. 1/(s — a) has the inverse f(t) = e, and 1/s has the inverse g(t) = 1. With f(r) = e and 


g(t — T) = 1 we thus obtain from (1) the answer 


t 
1 
h(t) =e“ #1 = [ e. 1dr =—(e™ — 1). 
0 


To check, calculate 


Le“) L(A). 1] 


1/1 1\ 1 
H(s) = LA\(s) ( ) = 
a\s—a 


Convolution 
Let H(s) = 1/(s? + w?)*. Find h(a). 
Solution. The inverse of 1/(s? ap w”) is (sin wt)/@. Hence from (1) and the first formula in (11) in App. 3.1 


we obtain 


: : t 
sin wf , sin wt 1 . : 
* x | sin wt sin w(t — 7) dr 

o wo J, 


A(t) 


t 
1 
= al [—cos wt + cos (2wt — wt)] dt 
2w 0 
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PROOF 


1 sin wt |t 
T COS wt + 
2 7T=0 
1 sin wt 
tcos wt + 
20 
in agreement with formula 21 in the table in Sec. 6.9. il 


We prove the Convolution Theorem 1. CAUTION! Note which ones are the variables 
of integration! We can denote them as we want, for instance, by 7 and p, and write 


F(s) = | e f(t) dt and G(s) = | e *P9(p) dp. 
0 0 


We now set t = p + 7, where 7 is at first constant. Then p = ft — 7, and ¢ varies from 
T to ©, Thus 


G(s) = | eS Pet — 7) dt = e | e*'g(t — 7) di. 


T 


7 in F and ¢ in G vary independently. Hence we can insert the G-integral into the 
F-integral. Cancellation of e~*” and e*” then gives 


-) 


F(s)G(s) = | eye” | e “g(t — r)dt dr = | so) | e g(t — r)dt dr. 


0 T 0 T 


Here we integrate for fixed 7 over ¢ from 7 to © and then over 7 from 0 to ~. This is the 
blue region in Fig. 141. Under the assumption on f and g the order of integration can be 
reversed (see Ref. [A5] for a proof using uniform convergence). We then integrate first 
over T from 0 to t and then over t from 0 to ™, that is, 


oo 


oo t 
F(s)G(s) = | ae | f(r)g(t — 7) dr dt = | e~*h(t)dt = L(h) = H(s). 


0 0 0 


This completes the proof. a 


t 


Fig. 141. Region of integration in the 
tr-plane in the proof of Theorem 1 
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From the definition it follows almost immediately that convolution has the properties 


ftg=a*f (commutative law) 

f* (g, + 92) =f*g, + f* ge (distributive law) 

(f*g)*v =f*(e *v) (associative law) 
f*0=02f7=0 


similar to those of the multiplication of numbers. However, there are differences of which 
you should be aware. 


EXAMPLE 3 _ Unusual Properties of Convolution 


f * 1 # fin general. For instance, 


t 
1 5 
tet= | retar=ie # t. 
i 2 


(f *f)(t) 2 0 may not hold. For instance, Example 2 with w = | gives 


sint * sint = —htcos to 3 sin t (Fig. 142). 


Fig. 142. Example 3 


We shall now take up the case of a complex double root (left aside in the last section in 
connection with partial fractions) and find the solution (the inverse transform) directly by 
convolution. 


EXAMPLE 4 _ Repeated Complex Factors. Resonance 
In an undamped mass-spring system, resonance occurs if the frequency of the driving force equals the natural 
frequency of the system. Then the model is (see Sec. 2.8) 


y” + wey = Ksinwot 


where w2 = k/m, k is the spring constant, and m is the mass of the body attached to the spring. We assume 
y(0) = 0 and y’(0) = 0, for simplicity. Then the subsidiary equation is 


2 2 Kwo Koo 
s“Y + woY = ——. Its solution is i eT EE 
s+ we (s? + wo)? 
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EXAMPLE 5 


This is a transform as in Example 2 with w = wo and multiplied by Kwo. Hence from Example 2 we can see 
directly that the solution of our problem is 


x (—Wot COs Wot + sin of). 
wo 


Kwo sin wot K 
y(t) z t cos Wot 4 
2009 Wo 2 


We see that the first term grows without bound. Clearly, in the case of resonance such a term must occur. (See 
also a similar kind of solution in Fig. 55 in Sec. 2.8.) =] 


Application to Nonhomogeneous Linear ODEs 


Nonhomogeneous linear ODEs can now be solved by a general method based on 
convolution by which the solution is obtained in the form of an integral. To see this, recall 
from Sec. 6.2 that the subsidiary equation of the ODE 


(2) y” + ay’ + by= rt) (a, b constant) 
has the solution [(7) in Sec. 6.2] 


¥(s) = [(s + ay(0) + y'O)]Q(s) + R)Q(s) 


with R(s) = £(r) and Q(s) = 1/ (s? + as + b) the transfer function. Inversion of the first 
term [---] provides no difficulty; depending on whether qa” — b is positive, zero, or 
negative, its inverse will be a linear combination of two exponential functions, or of the 
form (cy + cote 0 2 ora damped oscillation, respectively. The interesting term is 
R(s)Q(s) because r(t) can have various forms of practical importance, as we shall see. If 
y(O) = O and y'(0) = 0, then Y = RQ, and the convolution theorem gives the solution 


t 
(3) yd) = | q(t — T)r(r) dr. 


0 


Response of a Damped Vibrating System to a Single Square Wave 
Using convolution, determine the response of the damped mass-—spring system modeled by 
y” + 3y’ + 2y = rin), r(t) = 1 if 1 <t< 2 and 0 otherwise, y(0) = y'(0) = 0. 


This system with an input (a driving force) that acts for some time only (Fig. 143) has been solved by partial 
fraction reduction in Sec. 6.4 (Example 1). 


Solution by Convolution. The transfer function and its inverse are 


OAs) = 3 : hence qt) = e* — et 
s 


Hence the convolution integral (3) is (except for the limits of integration) 


1 
y(t) = [ac _ 7) -ldr= ie _ e 7e—™) d= et”) _ oc 


Now comes an important point in handling convolution. r(7) = 1 if 1 <7 < 2 only. Hence if t < 1, the integral 
is zero. If | < t < 2, we have to integrate from + = | (not 0) to ¢. This gives (with the first two terms from the 
upper limit) 


1-0 -G@-) _ 1,-2¢-D, _ 1 
(e xe )=9 


y(t) e? xe e t-D + de -2E-D, 
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If t > 2, we have to integrate from 7 = | to 2 (not to f). This gives 


— ,-¢-2) _ 1,-2¢-2) -(¢t-1 1 -2(t-1) 
y(t) =e —5e =(¢ _ xe »: 


Figure 143 shows the input (the square wave) and the interesting output, which is zero from 0 to 1, then increases, 
reaches a maximum (near 2.6) after the input has become zero (why?), and finally decreases to zero in a monotone 


fashion. a 
y(t) 
1 >—————_ 
| | 
| | 
0.5- I Output (response) 
i | ra 
| 
ie 
L ! i 
% 1 2 3 4 t 


Fig. 143. Square wave and response in Example 5 


Integral Equations 


Convolution also helps in solving certain integral equations, that is, equations in which the 
unknown function y(f) appears in an integral (and perhaps also outside of it). This concerns 
equations with an integral of the form of a convolution. Hence these are special and it suffices 
to explain the idea in terms of two examples and add a few problems in the problem set. 


EXAMPLE 6 _ A Volterra Integral Equation of the Second Kind 
Solve the Volterra integral equation of the second kind? 
t 
y(t) — | y(t) sin (t — 7) dr = t. 
0 


Solution. From (1) we see that the given equation can be written as a convolution, y — y * sin t = ¢. Writing 
Y = L(y) and applying the convolution theorem, we obtain 


W- mang 4 
s s s 
s+] +1 3 
The solution is 
(s) = ri = 2 a and gives the answer yt) =t e 


Check the result by a CAS or by substitution and repeated integration by parts (which will need patience). Ml 


EXAMPLE 7 Another Volterra Integral Equation of the Second Kind 


Solve the Volterra integral equation 


t 
yo) | (1 + 7)y(t — 7) dt = 1 — sinht. 
0 


5If the upper limit of integration is variable, the equation is named after the Italian mathematician VITO 
VOLTERRA (1860-1940), and if that limit is constant, the equation is named after the Swedish mathematician 
ERIK IVAR FREDHOLM (1866-1927). “Of the second kind (first kind)” indicates that y occurs (does not 
occur) outside of the integral. 


SEC. 6.5 Convolution. Integral Equations 237 


Solution. By (1) we can write y — (1 + )* y = 1 — sinh¢. Writing Y = L(y), we obtain by using the 
convolution theorem and then taking common denominators 


seal (++) 1 
ot -Cr alle; 


f-L 


(2-5 - 1)/s cancels on both sides, so that solving for Y simply gives 


Y(s) = 


st-1 


CONVOLUTIONS BY INTEGRATION 
Find: 
lll 


3. eb ee 


2. 1 * sin wt 
. 4. (cos wf) * (cos wf) 
5. (sin wf) * (cos wf) 6. e@ «(a # b) 


Tree 


8-14| INTEGRAL EQUATIONS 
Solve by the Laplace transform, showing the details: 


t 
8. y(t) + s| y(t \(t — 7) dr = 2t 
0 


t 
9. y(t) — | y(t) dt = 1 
0 
t 


10. y(t) — y(7) sin 2(t — 7) dt = sin 2t 
0 
t 
11. ya) + | (t — t)y(7) drt = 1 
0 
t 
12. y(t) + y(t) cosh (t — 7) dt =t + ée 


o 


t 
13. y(t) + 2 | y(r)e~* dr = tet 
0 


t 
14, y(t) — | y(t)(t — tT) dr = 2 —- x 
0 

15. CAS EXPERIMENT. Variation of a Parameter. 
(a) Replace 2 in Prob. 13 by a parameter k and 
investigate graphically how the solution curve changes 
if you vary k, in particular near k = —2. 
(b) Make similar experiments with an integral equation 
of your choice whose solution is oscillating. 


sy-s-1 s*-1-5 
hence Y(s) - 3 3 i 
S s(s~ — 1) 
and the solution is y(t) = cosh ¢. a 


PROBLEEM—SET-6-5 


16. TEAM PROJECT. Properties of Convolution. Prove: 
(a) Commutativity, f* g = g*f 
(b) Associativity, (f* g) *u = f* (g *v) 
(c) Distributivity, f* (gy + go) = f* 97 + f* go 
(d) Dirac’s delta. Derive the sifting formula (4) in Sec. 
6.4 by using fj, with a = 0 [(1), Sec. 6.4] and applying 
the mean value theorem for integrals. 


(e) Unspecified driving force. Show that forced 
vibrations governed by 


y" +o°y =r), yO) = Ki, y') = Ke 


with w # 0 and an unspecified driving force r(t) 
can be written in convolution form, 


le Ke 
y= @ Sn at * + Kycos wt + “p Sin ot. 


17-26; INVERSE TRANSFORMS 
BY CONVOLUTION 


Showing details, find f(A) if LC f) equals: 


5.5 1 
17. — 18. 
(s + 1.5)(s = 4) (s — ay’ 
19 27s 20 9 
"(52 + a7)? " s(s + 3) 
ra) e * 
21. == 22. 
s*(s7 + w) s(s — 2) 
40. 24 
73, —H05_ 24, 240 
s(s* — 9) (s* + 1)(s* + 25) 
1 
25. ue 
(s“ + 36) 


26. Partial Fractions. Solve Probs. 17, 21, and 23 by 
partial fraction reduction. 
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6.6 Differentiation and Integration of Transforms. 
ODEs with Variable Coefficients 


EXAMPLE 1 


The variety of methods for obtaining transforms and inverse transforms and _ their 
application in solving ODEs is surprisingly large. We have seen that they include direct 
integration, the use of linearity (Sec. 6.1), shifting (Secs. 6.1, 6.3), convolution (Sec. 6.5), 
and differentiation and integration of functions f(f) (Sec. 6.2). In this section, we shall 
consider operations of somewhat lesser importance. They are the differentiation and 
integration of transforms F(s) and corresponding operations for functions f(t). We show 
how they are applied to ODEs with variable coefficients. 


Differentiation of Transforms 


It can be shown that, if a function f(‘) satisfies the conditions of the existence theorem in 
Sec. 6.1, then the derivative F’(s) = dF, /ds of the transform F(s) = L( f) can be obtained 
by differentiating /(s) under the integral sign with respect to s (proof in Ref. [GenRef4] 
listed in App. 1). Thus, if 


F(s) = | e f(t) dt, then F(s) = —- | e “tf (t) dt. 


0 0 
Consequently, if £(f) = F(s), then 
(1) L{tf(O} = —F'(s), hence LF (s)} = -f 0 


where the second formula is obtained by applying £ ~! on both sides of the first formula. 
In this way, differentiation of the transform of a function corresponds to the multiplication 
of the function by —t. 

Differentiation of Transforms. Formulas 21—23 in Sec. 6.9 


We shall derive the following three formulas. 


L(f) S(O) 
1 1 
2 (sin Bt t t 
(2) G24 1 sin Bt — Bt cos Bt) 
Ss L .. 
(3) as PP 3p sin Bt 
(4) * : (sin Bt + Bt cos Br) 
(s2 + B22 2B 


Solution. From (1) and formula 8 (with w = B) in Table 6.1 of Sec. 6.1 we obtain by differentiation 
(CAUTION! Chain rule!) 


2Bs 


L(t sin Bt) = —————_.. 
B (s2 + By 
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EXAMPLE 2 


Dividing by 26 and using the linearity of £, we obtain (3). 
Formulas (2) and (4) are obtained as follows. From (1) and formula 7 (with w = B) in Table 6.1 we find 


(s2 + B) — 252 2 — p? 


(92 + B22 (s2 ent B22 


(5) L£(t cos Bt) 


From this and formula 8 (with w = B) in Table 6.1 we have 


ee Be | 1 
(s2 + pr? ~ 2 + pe 


a+ cos Bt + : sin pr) 
B 


On the right we now take the common denominator. Then we see that for the plus sign the numerator becomes 
se B +s? + B = 2s", so that (4) follows by division by 2. Similarly, for the minus sign the numerator 
takes the form s2 — 8? — s2 — B? = —2?, and we obtain (2). This agrees with Example 2 in Sec. 6.5. 


Integration of Transforms 


Similarly, if f(¢) satisfies the conditions of the existence theorem in Sec. 6.1 and the limit 
of f(t)/t, as t approaches 0 from the right, exists, then for s > k, 


(6) | = | F(S) ds hence oan | F(3) as} oe 


In this way, integration of the transform of a function f(t) corresponds to the division of 


f(t) by t. 


We indicate how (6) is obtained. From the definition it follows that 


| F(3) ds = | | | Fp atlas, 
Ss Ss 0 


and it can be shown (see Ref. [GenRef4] in App. 1) that under the above assumptions we 
may reverse the order of integration, that is, 


| Fe) as = | | Fy ds | d= | rc] | Fas) 
Ss 0 s 0 Ss 


Integration of e~** with respect to 5 gives ey (—1t). Here the integral over s on the right 
equals ey t. Therefore, 


| F(3)ds = | ea = «| (s>k. @ 
Ss 0 


Differentiation and Integration of Transforms 


2 2 2 
Find the inverse transform of In (: + =) =In : 
Ss 


Solution. Denote the given transform by F(s). Its derivative is 


28 25 


d 
F'(s) (in (s? + w”) — In a) 
ds rte s 
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Taking the inverse transform and using (1), we obtain 


2s 


st + 0 


L-{F'(s)} = at =| 2 cos wt — 2 if(t). 
Ss 


Hence the inverse f(t) of F(s) is f(t) = 21 — cos wt)/t. This agrees with formula 42 in Sec. 6.9. 


Alternatively, if we let 


2s 


st + 0 S 


G(s) = then g(t) = £-\(G) — 2Acos wt — 1). 


From this and (6) we get, in agreement with the answer just obtained, 


2+ a : a(t) 
fin! —_ \ = | G(s) as} = a = =a — cos af), 
s 


the minus occurring since s is the lower limit of integration. 
In a similar way we obtain formula 43 in Sec. 6.9, 


2 
2. 
ofin(1 = =) = 7 — cosh at). o 
s 


Special Linear ODEs with Variable Coefficients 


Formula (1) can be used to solve certain ODEs with variable coefficients. The idea is this. 
Let £(y) = Y. Then L(y’) = sY — y(0) (see Sec. 6.2). Hence by (1), 


dY 
ds” 


2h -s0O1=-¥=% 


(7) L(ty') = rs 


Similarly, £(y"”) = s2¥ — sy(0) — y’(0) and by (1) 


4 1.2y — sy(0) — y'(0)] = -25Y - 5? + yo, 


® LH")= - a 


Hence if an ODE has coefficients such as at + b, the subsidiary equation is a first-order 
ODE for Y, which is sometimes simpler than the given second-order ODE. But if the latter 
has coefficients at? + bt + c, then two applications of (1) would give a second-order 
ODE for Y, and this shows that the present method works well only for rather special 
ODEs with variable coefficients. An important ODE for which the method is advantageous 
is the following. 


Laguerre’s Equation. Laguerre Polynomials 


Laguerre’s ODE is 
(9) ty” +(1— dy’ + ny =0. 


We determine a solution of (9) with n = 0, 1, 2,---. From (7)-(9) we get the subsidiary equation 


dy dY 
2sY ie + 10) + s¥ — y(0) ( ¥ 2) + nY = 0. 
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Simplification gives 


(s s?) t+ (n+1-—s)¥=0. 


Separating variables, using partial fractions, integrating (with the constant of integration taken to be zero), and 
taking exponentials, we get 


dY n+1-s n n+1 
(10*) = ds ds and 
Y = 8 ae | S 


(s — 1)” 
Y=— 


grtt 


We write 1, = gh Y) and prove Rodrigues’s formula 


t n 


Ip(t) = * : (t"e~*) 
nt) = — —_ (t"e™), 
dt 


n! n 


(10) lo = 1, n=1,2,-. 


These are polynomials because the exponential terms cancel if we perform the indicated differentiations. They 
are called Laguerre polynomials and are usually denoted by L,, (see Problem Set 5.7, but we continue to reserve 
capital letters for transforms). We prove (10). By Table 6.1 and the first shifting theorem (s-shifting), 


! d” Is” 
Lt"e 4) = a, hence by (3) in Sec. 6.2 of = wren} Ss 
(s+1)"*1 dt” (s+ prt 


because the derivatives up to the order n — | are zero at 0. Now make another shift and divide by n! to get 
[see (10) and then (10*)] 


= 1)" 
gaj=8— > =x |_| 


PROBLEM SET 6-6 


1. REVIEW REPORT. Differentiation and Integration 
of Functions and Transforms. Make a draft of these 
four operations from memory. Then compare your draft 
with the text and write a 2- to 3-page report on these 
operations and their significance in applications. 


TRANSFORMS BY DIFFERENTIATION 
Showing the details of your work, find £( f) if f(A) equals: 
2. 3t sinh 4t 
3. xte** 
4. te~' cos t 
5. tf cos wt 
6. 17 sin 3¢ 
7. t7 cosh 2t 
8. te~** sin t 
9, 5t? sin mt 
10. re! 
11. 4t cos amt 
12. CAS PROJECT. Laguerre Polynomials. (a) Write a 
CAS program for finding /,,(t) in explicit form from (10). 
Apply it to calculate Jo,---, 149. Verify that lo,---, /10 
satisfy Laguerre’s differential equation (9). 


13 


(b) Show that 


n —1)” n 
L»(d) = > ( en 


m=0 m! m 
and calculate /9,---, /49 from this formula. 
(c) Calculate Jo, ---, 149 recursively from /p = 1, 14 = 
1 — tby 


(n+ Ding = (2n +1 —Dly — npr. 


(d) A generating function (definition in Problem Set 
5.2) for the Laguerre polynomials is 

ey L,(t)x” =(1- x) belt/@—D 
=0 


n= 


Obtain /o,-+-, /19 from the corresponding partial sum 
of this power series in x and compare the /,, with those 
in (a), (b), or (c). 


CAS EXPERIMENT. Laguerre Polynomials. Ex- 
periment with the graphs of /o,---,/19, finding out 
empirically how the first maximum, first minimum, --- 
is moving with respect to its location as a function of 
n. Write a short report on this. 
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14-20} INVERSE TRANSFORMS 


Using differentiation, integration, s-shifting, or convolution, 
and showing the details, find f(1) if £(f) equals: 


—— 
"(52 + 16)2 
S 


15. ——_—~ 
(s — 9)? 


6./ Systems of ODEs 


2s + 6 


16. ———______ 
(s2 + 65 + 10)” 


Ss Ss 
17. In ee 18. arccot 
2 
sot + 
iia int 
17 stb 


The Laplace transform method may also be used for solving systems of ODEs, as we shall 
explain in terms of typical applications. We consider a first-order linear system with 
constant coefficients (as discussed in Sec. 4.1) 


, 
J1 


() 


ro 
y2 


Writing 4% = L(y), % = L(y2), 


the subsidiary system 


= a4191 + ay2Y2 + gy(t) 


421¥1 + Ag2yo + got). 


G, = £(g1), Go = L(ge), we obtain from (1) in Sec. 6.2 


s¥ — yO) = ayi% + a2 + Gy(s) 
s¥ — yo(0) = do1% + dogk + Gols). 


By collecting the 4- and ¥%-terms we have 


(441 — K+ 


(2) 


421% 


ay2% = —y (0) — Gy(s) 


+ (dg2 — 5)¥2 = —ya(0) — Gols). 


By solving this system algebraically for ¥j(s),¥(s) and taking the inverse transform we 
obtain the solution yy = 21K), vo = LOS) of the given system (1). 

Note that (1) and (2) may be written in vector form (and similarly for the systems in 
the examples); thus, setting y = [1 yal", A = [ax], ¢ = [gi gol", Y=([¥% x)" 


G = [G, G2]" we have 


y =Aytg 


EXAMPLE 1 


and (A 


sDY = —y(0) — G. 


Mixing Problem Involving Two Tanks 


Tank 7, in Fig. 144 initially contains 100 gal of pure water. Tank 7 initially contains 100 gal of water in which 
150 lb of salt are dissolved. The inflow into 7, is 2 gal/min from 72 and 6 gal/min containing 6 Ib of salt from 
the outside. The inflow into 73 is 8 gal/min from Tj. The outflow from 72 is 2 + 6 = 8 gal/min, as shown in 
the figure. The mixtures are kept uniform by stirring. Find and plot the salt contents y(t) and yo(f) in 7, and 


Th, respectively. 
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Solution. The model is obtained in the form of two equations 


Time rate of change = Inflow/min — Outflow/min 
for the two tanks (see Sec. 4.1). Thus, 


rg i2e fe Bea Be 
Y1 = — 00.1 + too y2 + 6. y2 = 100.1 — 100y2- 


The initial conditions are y;(0) = 0, yo(0) = 150. From this we see that the subsidiary system (2) is 


6 
(-0.08-sK+ 002% =-~ 
0.08% + (—0.08 — s)% = —150. 


We solve this algebraically for % and ¥% by elimination (or by Cramer’s rule in Sec. 7.7), and we write the 
solutions in terms of partial fractions, 


¥ 9s + 0.48 100 62.5 Sea) 
s(s + 0.12)(s + 0.04) Ss s+0.12 s+ 0.04 

; 150s? + 125 + 0.48 — 100 , 125 ia. 
s(s + 0.12)(s + 0.04) Ss s+0.12 s + 004 


By taking the inverse transform we arrive at the solution 


yy = 100 — 62.5e791%4 — 37.5¢7 0.0% 
UO + 12567" = 75a" BO, 


2, 


Figure 144 shows the interesting plot of these functions. Can you give physical explanations for their main 
features? Why do they have the limit 100? Why is yg not monotone, whereas y, is? Why is y, from some time 


on suddenly larger than yo? Etc. | 
6 gal/min y(e) 
a> 150 
of Lids Salt content in 7, 
100 


8 gal/min 
———=S 


50}. Salt content in 7, 


| ! L L 
50 100 150 200 t 


Usa 


6 gal/min 


Fig. 144. Mixing problem in Example 1 


Other systems of ODEs of practical importance can be solved by the Laplace transform 
method in a similar way, and eigenvalues and eigenvectors, as we had to determine them 
in Chap. 4, will come out automatically, as we have seen in Example 1. 


Electrical Network 


Find the currents i,(f) and i9(f) in the network in Fig. 145 with L and R measured in terms of the usual units 
(see Sec. 2.9), v(t) = 100 volts if 0 S t S$ 0.5 sec and 0 thereafter, and i(0) = 0, i’(0) = 0. 


Solution. The model of the network is obtained from Kirchhoff’s Voltage Law as in Sec. 2.9. For the lower 
circuit we obtain 


0.81, + I(iy — ig) + 1.41, = 100[1 — u(r — 4) 
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i(t) 
30 


L 
Oo O05 1 15 2 25 3 ¢ 
Currents 


Network 


Fig. 145. Electrical network in Example 2 


and for the upper 
1+ is + (ig — 14) = 0. 
Division by 0.8 and ordering gives for the lower circuit 


i + 3i, — 1.25ig = 125[1 — u(t — })] 


and for the upper 
ig —iy + in = 0. 


With 7,(0) = 0, i2(0) = 0 we obtain from (1) in Sec. 6.2 and the second shifting theorem the subsidiary 
system 


1 e3/2 
(s + 3), — 1.25h5 125(4 - ) 
-ht+(st+ Db = 0. 


Solving algebraically for J; and Jy gives 


125(s + 1) - 
qh qi eS), 
s(s + a)(s + 3) 
125 = 
Ip : eile /2) 
s(s + g)(s + 3) 


The right sides, without the factor 1 — e */?. have the partial fraction expansions 


500 125 625 
7s 3(s +3) 9 21s + 3) 


and 


500 250 250 


 te+h sie+d) 


respectively. The inverse transform of this gives the solution forO0 StS Z, 


: 125 -t/2 _ 625 —7t/2 , 500 
ix(t) 25 9 t/ oT € 12 4 7 


250 ,-t/2 , 250 -7t/2 , 500 
ze + sre + 9 


ig(t) 
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According to the second shifting theorem the solution for t > Z is 14(t) — ix(t — 3) and io(t) — io(t — 4), that is, 


i,(t) = -15 (4 = eM/4yo-H/2 S257 _ e7/4) 1/2 
250 250 (t > 9). 
ig(t) = —3°U — eV 4ye—V/2 4 Sd — et e— 11/2 


Can you explain physically why both currents eventually go to zero, and why i;(¢) has a sharp cusp whereas 
ig(t) has a continuous tangent direction at tf = 39 | 


Systems of ODEs of higher order can be solved by the Laplace transform method in a 
similar fashion. As an important application, typical of many similar mechanical systems, 
we consider coupled vibrating masses on springs. 


k 
= = m,=1 
nat 
k 
‘TT m,=1 
y 
; k 
Fig. 146. Example 3 


Model of Two Masses on Springs (Fig. 146) 


The mechanical system in Fig. 146 consists of two bodies of mass | on three springs of the same spring constant 
k and of negligibly small masses of the springs. Also damping is assumed to be practically zero. Then the model 
of the physical system is the system of ODEs 


yt = —kyr + kOe — yd) 
(3) " 
yo = —kQ2 — yi) — kyo. 


Here y, and yg are the displacements of the bodies from their positions of static equilibrium. These ODEs 
follow from Newton’s second law, Mass X Acceleration = Force, as in Sec. 2.4 for a single body. We again 
regard downward forces as positive and upward as negative. On the upper body, —ky, is the force of the 
upper spring and k(yg — y,) that of the middle spring, yg — y, being the net change in spring length—think 
this over before going on. On the lower body, —k(y2 — yy) is the force of the middle spring and —kyg that 
of the lower spring. 

We shall determine the solution corresponding to the initial conditions y;(0) = 1, ye(0) = 1, y4(0) V3k, 
yo(0) = —V3k. Let ¥ = L(y) and % = L(y2). Then from (2) in Sec. 6.2 and the initial conditions we obtain 
the subsidiary system 


52% —s — V3k = -ky + ki — %) 
s°% —s + V3k = —k(% — %) — kM. 


This system of linear algebraic equations in the unknowns Y% and ¥ may be written 


=5+ V3k 
V3k. 


(s7+2k)%—- k% 


ky. + (874 


2k)% =s 
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Elimination (or Cramer’s rule in Sec. 7.7) yields the solution, which we can expand in terms of partial fractions, 


2. 


(s + V3k\(s2 + 2k) + ks — V3) s | V3k 
(s? + 2k)? — k? tk st 43k 

__ (62 + 2s — V3K) + Ks + V3K) sg V3k 
(s? + 2k)? — Kk? Stk st4+3k 


Hence the solution of our initial value problem is (Fig. 147) 


yi) = £7'(K) = cos 
yo(t) = £7'(%) = cos 


kt + sin V3kt 
kt — sin V 3kt. 


We see that the motion of each mass is harmonic (the system is undamped!), being the superposition of a “slow” 


oscillation and a “rapid” oscillation. 


Fig. 147. Solutions in Example 3 


PROBLEM SET 6.7 


1. TEAM PROJECT. Comparison of Methods for 
Linear Systems of ODEs 
(a) Models. Solve the models in Examples 1 and 2 of 
Sec. 4.1 by Laplace transforms and compare the amount 
of work with that in Sec. 4.1. Show the details of your 
work. 
(b) Homogeneous Systems. Solve the systems (8), 
(11)-C13) in Sec. 4.3 by Laplace transforms. Show the 
details. 
(c) Nonhomogeneous System. Solve the system (3) in 
Sec. 4.6 by Laplace transforms. Show the details. 


2-15| SYSTEMS OF ODES 


Using the Laplace transform and showing the details of 
your work, solve the IVP: 


2 y, ty. =0, y, + yo = 2cost, 
yi0) = 1, ya(0) = 0 

3. yi = —yi + 4ye, ya = 3y1 — 2ye, 
y1(0) = 3, ye(0) = 4 

4. yj = 4yq — 8cos4t, ys = —3y, — 9 sin 44, 
y10) = 0, ya(0) = 3 


5. 


yi =yet1—u(t—1), yo yy +1 — u(t — 1, 


y1(0) = 0, y2(0) = 0 


! 1 ! _— | 
1 = Sy1 + ye, Yo = y1 + Sye, 
yi(0) = 1,  y2(0) = —3 
s yl = 2y, — 4ye + u(t Le’, 


yo = y1 — 3y2 + u(t — De’, yi) = 3, yo(0) = 0 


8. yj = —2y1 + 3y2, yo = 491 — Yas 
y1(0) = 4, yo(0) = 3 

9. yi =4y1 + ye, yo = —y1 + 2ye, y1(0) = 3, 
yo(0) = 1 

10. yj = —yo, yo = —y, + 2[1 — u(t — 277) cos ¢, 
y1(0) = 1, yo(0) = 0 

11. yt =y1 + 3y2, ys = 491 — 4e’, 


12. 


13. 


yi(0) = 2, yi) = 3, yo(0) = 1, ya(0) = 2 


yt = —2y, + 272, yg = 2y1 — 5yo, 

yi) = 1, (0) =0, yo(0) = 3, y2(0) = 0 
yt + yo = —101 sin 104, yo + yy = 101 sin 102, 
yi) = 0, yi) = 6, yo(0) = 8, y3(0) = -6 
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14. 


15. 


4yi + yz — 2yg = 0, —2y7 + y3 = 1, 
2y3 — 4y3 = —16t 

y1(0) = 2, ye(0) = 0, y3(0) = 0 

YI + yg = 2 sinht, yy + y3 =e, 

yg + yi = 2eh +e, yx(0) = 1, yo(0) = 1, 
y3(0) = 0 


FURTHER APPLICATIONS 


16. Forced vibrations of two masses. Solve the model in 


17. 


18. 


19. 


Example 3 with k = 4 and initial conditions y,(0) = 1, 
y1(0) = 1, yo(0) = 1, ys = —1 under the assumption 
that the force 11 sin fis acting on the first body and the 
force —11sin¢ on the second. Graph the two curves 
on common axes and explain the motion physically. 


CAS Experiment. Effect of Initial Conditions. In 
Prob. 16, vary the initial conditions systematically, 
describe and explain the graphs physically. The great 
variety of curves will surprise you. Are they always 
periodic? Can you find empirical laws for the changes 
in terms of continuous changes of those conditions? 


Mixing problem. What will happen in Example 1 if 
you double all flows (in particular, an increase to 
12 gal/min containing 12 Ib of salt from the outside), 
leaving the size of the tanks and the initial conditions 
as before? First guess, then calculate. Can you relate 
the new solution to the old one? 


Electrical network. Using Laplace transforms, 
find the currents i,(f) and io(f) in Fig. 148, where 
v(t) = 390 cos t and 74(0) = 0, i9(0) = 0. How soon 
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will the currents practically reach their steady state? 


Network 


i) 


Currents 


Fig. 148. Electrical network and 
currents in Problem 19 


20. Single cosine wave. Solve Prob. 19 when the EMF 
(electromotive force) is acting from 0 to 277 only. Can 
you do this just by looking at Prob. 19, practically 
without calculation? 
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6.8 Laplace Transform: General Formulas 


Formula Name, Comments Sec. 
F(s) = LL f()} = | e“£(t) dt Definition of Transform 
? 6.1 
f@® = L{F(s)} Inverse Transform 
L{af(t) + bet} = akL{f} + bL{ gh} Linearity 6.1 
at = = 
ee i, s-Shifting 
LF (s -a}= e“F (1) (First Shifting Theorem) 6.1 
L£(f') = s£L(f) — fO) 
Lf") = s2?¥L( f) — sf) — f'(O) Differentiation of Function 
x F™) = gL f) = gD FO) Ss. bis 6.2 
eee ae f?-10) 
: 1 
{| f(t) ar} als LC f) Integration of Function 
0 
t 
(f*3)® = | revs — 1) dr 
i" 
= | f(t — T)g(7) dt Convolution 6.5 
0 
L(f*g) = L(f)L(g) 

L{f(t — a)u(t — a)} =e “F(s) Shifting ' 
Le SF(s)} = f(t — adult — a) (Second Shifting Theorem) , 
L{tf(t)} = —F'(s) Differentiation of Transform 

fi) ee 6.6 
| = | F(s)ds Integration of Transform 
p 6.4 
Lf) = | e ““f(t) dt f Periodic with Period p Project 


— pps 
l-e i 


16 


SEC. 6.9 Table of Laplace Transforms 249 
6.9 Table of Laplace Transforms 
For more extensive tables, see Ref. [A9] in Appendix 1. 
F(s) = £{fO} fo) Sec. 
1 1/s 1 
2 1/s? t 
3 is” @=1,2,-**) t"1/(n — 1)! ei 
4 1/Vs 1/Vat 
5 1/s?/? WWVi/T 
6 I/st* (a>0) f-UTe) 
7 1 ett 
sS—a 
8 d tet 
(s — a) 
1 1 6.1 
9 (n = 1,2,---) pr lat 
(s — a)” (n— 1)! 
1 1 k-1 at 
10 (k > 0) t’"e 
(s — ak Pk) 
1 1 
11 b at _ bt 
(s — a)(s — b) (a ) Gab eo) 
Ss 
12 b at _ pedt 
(s — a)(s — b) (a ) a ide oo 
13 : — sin wt 
eta 
14 : cos wt 
se +a" 
S: : Z sinh at 
52 ca, az a 
s 6.1 
16 cosh at 
s? — a 
17 : 1 at sinh wt 
(s — ay’ + ow ad 
18 — e” cos wt 
(s — ay’ + w 
19 : Eas — COS wf) 
2 2 2 
s(s~ + w”) 62 
20 : Ete — sin wf) 
s°(s2 + w?) 3 


(continued) 
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Table of Laplace Transforms (continued) 


F(s) = £{fO} fo) Sec. 
1 eee 
21 <a ae — (sin wt — wt COS wf) 
(s* + w*) 20 
S ts. 
22 (2 4+ w)? oa sin wt 6.6 
a 1 
23 ee —(sin wt + wt cos wt) 
(s? + w7)? 20 
24 2 (a b?) (cos at — cos bf) 
(s? + ays 2 + b?) b? — a? 
1 pes : 
25 aa — (sin kt cos kt — cos kt sinh kt) 
gs” + 4k 4k 
1 
26 | —-— — sin kt sinh ke 
gs” + 4k 2k 
1 ee ? 
27 ri 7 3 (sinh kt — sin kt) 
s'—k 2k 
si 1 
28 aI r 5 (cosh kt — cos kt) 
sok 2k 
29 | Vs—a-—Vs—b : ate") 
2V Tt 
VstaVs+b 2 
1 
31 Jo(at) J54 
s2 + a? 
32 1 at 42 
(s — a)?” Vai : a 
k-1/2 
1 Var ( t 
33 (k > 0) red Were Ty—1/2(at) [5:5 
(s? — a) T(k) \2a 
34 e /s u(t — a) 6.3 
35 es O(t — a) 6.4 
1 
26) =e" Jo(2 Vit) J5.4 
37 1 o-kys cos 2Vkt 
S Tt 
38 1 els | sinh 2 
s3/2 N ark 
39 oe kVs (k ss 0) k ek /4t 
2V mt? 


(continued) 
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Table of Laplace Transforms (continued) 
F(s) = £{fO} fO Sec. 
1 
40 a Ins —Int-—y (y ~ 0.5772) y 5.5 
41 | Ino—* Jd (got — eat) 
s='D t 
2 2 
+ 
42 In ae, . (1 — cos af) 6.6 
s? t 
2 2 
- 2 
43 In : 2 — (1 — cosh aft) 
s? t 
44 arctan = — sin wt 
S t 
1 . App. 
45 = arccot s Si(t) A3.1 


1. State the Laplace transforms of a few simple functions 
from memory. 

2. What are the steps of solving an ODE by the Laplace 
transform? 

3. In what cases of solving ODEs is the present method 
preferable to that in Chap. 2? 

4. What property of the Laplace transform is crucial in 
solving ODEs? 

5. Is L{fO + gO} = L{FO} + L{ gw}? 
L{FOsO} = L{FO}L{ gO}? Explain. 

6. When and how do you use the unit step function and 
Dirac’s delta? 

7. If you know f(t) = £-1{F(s)}, how would you find 
L1{ F(s)/s2}? 

8. Explain the use of the two shifting theorems from memory. 

9. Can a discontinuous function have a Laplace transform? 
Give reason. 

10. If two different continuous functions have transforms, 

the latter are different. Why is this practically important? 


11-19} LAPLACE TRANSFORMS 

Find the transform, indicating the method used and showing 
the details. 

11. 5 cosh 2t — 3 sinht 
13. sin? (771) 


12. e~"(cos 4t — 2 sin 41) 
14. 16¢7u(t — 4) 
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15. e/u(t — 3) 
17. tcost + sint 
19, 12t * e 3 


16. u(t — 277) sint 
18. (sin wf) * (cos wf) 


20-28 | INVERSE LAPLACE TRANSFORM 


Find the inverse transform, indicating the method used and 
showing the details: 


eS) ski 
20. ——___—- 21. e* 
s?—25-8 s 
1 
16 @cos@ + ssin@ 
22. PO sens de 23. ne: oe oe 
s tstg so +o 
s? — 6.25 6(s + 1) 
"(52 + 6.25)? ” 
25-1 +4 
16° a 
set 4s +5 
28. — 
sg" = 2s + 2 
29-37| ODEs AND SYSTEMS 


Solve by the Laplace transform, showing the details and 
graphing the solution: 


29, y” + 4y' + 5y = 504, yO) =5, y'(0) = —-5 
30. y" + l6y = 46(¢ -— 7), y(0)=—1, y'(0)=0 
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31. y" — y’ — 2y = 12u(t — a) sint, yO) = 1, 
y(0)=-1 

32. y” + 4y = &(t — 7) — S(t — 277), yO) = 1, 
y'(0) =0 

33. y” + 3y’ + 2y = 2u(t — 2), y(0)=0, y'(0) =0 

34. yi =ye, yo = —4y1 + 6(t— 7), yi) = 0, 
y2(0) = 0 

35. yi = 2y1 — 4y2, yo = y1 — 3y2, yr(0) = 3, 
y2(0) = 0 

36. yi = 2y1 + 4y2, yo =y1 + 2y2, y1(0) = 4, 
y2(0) = —4 

37. yj =yo + u(t -— 7), yo = —yy + u(t — 277), 
yi0) = 1, y2(0) = 0 

38-45 | MASS—SPRING SYSTEMS, CIRCUITS, 

NETWORKS 


Model and solve by the Laplace transform: 


38. 


39. 


40. 


41. 


Show that the model of the mechanical system in 
Fig. 149 (no friction, no damping) is 
miyl = —k1y1 + ke(ye — yi) 
mgyz = —k2(y2 — y1) — kya). 
6) ) 
| V1 | Vo 
«_ -— 
k ! k k 


Fig. 149. System in Problems 38 and 39 


In Prob. 38, let m1 = my = 10 kg,ky = kg = 20 kg/sec?, 
ko = 40 kg/ sec”. Find the solution satisfying the ini- 
tial conditions y,(0)=y2(0)=0, y1(0)=1 meter/sec, 
y3(0) = —1 meter/sec. 

Find the model (the system of ODEs) in Prob. 38 
extended by adding another mass mg and another spring 
of modulus k4q in series. 

Find the current i(t) in the RC-circuit in Fig. 150, 
whereR = 100, C = 0.1 F,v(@) = 10r Vif0O <t <4, 
v(t) = 40 V if t>4, and the initial charge on the 
capacitor is 0. 


R ] Cc 
v(t) 
Fig. 150. RC-circuit 


42. 


43. 


44. 


45. 


Find and graph the charge q(f) and the current i(f) in 
the LC-circuit in Fig. 151, assuming L = 1H, C = 1F, 
vf) =1-—e° if 0<t<7,v(t) =0 if t> 7, and 
zero initial current and charge. 

Find the current i(f) in the RLC-circuit in Fig. 152, where 
R= 160 QO, L = 20H, C = 0.002 F, v(¢) = 37 sin 101 V, 
and current and charge at f = 0 are zero. 


Cc | L R L 
v(t) v(t) 


Fig. 151. LC-circuit Fig. 152. RLC-circuit 


Show that, by Kirchhoff’s Voltage Law (Sec. 2.9), the 
currents in the network in Fig. 153 are obtained from 
the system 


Lit + R(iz — ig) = v() 
1 
R(ig — iy) + G2 =0. 


Solve this system, assuming that R= 100, L = 20H, 
C = 0.05 F,v = 20 V, i1(0) = 0, ig(0) = 2 A. 


Fig. 153. Network in Problem 44 


Set up the model of the network in Fig. 154 and find 
the solution, assuming that all charges and currents are 
0 when the switch is closed at t = 0. Find the limits of 
i1(t) and ig(t) as t— ™, (1) from the solution, (ii) directly 
from the given network. 


C=0.05 F 


Switch 
Fig. 154. Network in Problem 45 


R, = 300 
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SUMMARY—-OF CHAPTER 6 


Laplace Transforms 


The main purpose of Laplace transforms is the solution of differential equations and 
systems of such equations, as well as corresponding initial value problems. The 
Laplace transform F(s) = L(f) of a function f(A) is defined by 


(1) Fy = Lf) = | e f(t) dt (Sec. 6.1). 


0 


This definition is motivated by the property that the differentiation of f with respect 
to ¢ corresponds to the multiplication of the transform F by s; more precisely, 


. L(f') = s£(f) — f(0) (Sec. 6.2) 


Lf") = 8° Lf) — sf) - f'O) 
etc. Hence by taking the transform of a given differential equation 
(3) y” + ay’ + by = rt) (a, b constant) 
and writing £(y) = Y(s), we obtain the subsidiary equation 
(4) (s2 + as + b)Y = £(r) + sf(0) + f'(0) + af(0). 


Here, in obtaining the transform £(r) we can get help from the small table in Sec. 6.1 
or the larger table in Sec. 6.9. This is the first step. In the second step we solve the 
subsidiary equation algebraically for Y(s). In the third step we determine the inverse 
transform y(t) = gy), that is, the solution of the problem. This is generally 
the hardest step, and in it we may again use one of those two tables. Y(s) will often 
be a rational function, so that we can obtain the inverse £7 *(Y) by partial fraction 
reduction (Sec. 6.4) if we see no simpler way. 

The Laplace method avoids the determination of a general solution of the 
homogeneous ODE, and we also need not determine values of arbitrary constants 
in a general solution from initial conditions; instead, we can insert the latter directly 
into (4). Two further facts account for the practical importance of the Laplace 
transform. First, it has some basic properties and resulting techniques that simplify 
the determination of transforms and inverses. The most important of these properties 
are listed in Sec. 6.8, together with references to the corresponding sections. More 
on the use of unit step functions and Dirac’s delta can be found in Secs. 6.3 and 
6.4, and more on convolution in Sec. 6.5. Second, due to these properties, the present 
method is particularly suitable for handling right sides r(¢) given by different 
expressions over different intervals of time, for instance, when r(f) is a square wave 
or an impulse or of a form such as r(t) = cos t if 0 S t S 477 and 0 elsewhere. 

The application of the Laplace transform to systems of ODEs is shown in Sec. 6.7. 
(The application to PDEs follows in Sec. 12.12.) 


PART B 


Linear Algebra. 
Vector Calculus 


> 
in es] 


CHAPTER 7 Linear Algebra: Matrices, Vectors, Determinants. Linear Systems 
CHAPTER 8 Linear Algebra: Matrix Eigenvalue Problems 
CHAPTER 9 Vector Differential Calculus. Grad, Div, Curl 
CHAPTER 10 Vector Integral Calculus. Integral Theorems 


Matrices and vectors, which underlie linear algebra (Chaps. 7 and 8), allow us to represent 
numbers or functions in an ordered and compact form. Matrices can hold enormous amounts 
of data—think of a network of millions of computer connections or cell phone connections— 
in a form that can be rapidly processed by computers. The main topic of Chap. 7 is how 
to solve systems of linear equations using matrices. Concepts of rank, basis, linear 
transformations, and vector spaces are closely related. Chapter 8 deals with eigenvalue 
problems. Linear algebra is an active field that has many applications in engineering 
physics, numerics (see Chaps. 20-22), economics, and others. 


Chapters 9 and 10 extend calculus to vector calculus. We start with vectors from linear 
algebra and develop vector differential calculus. We differentiate functions of several 
variables and discuss vector differential operations such as grad, div, and curl. Chapter 10 
extends regular integration to integration over curves, surfaces, and solids, thereby 
obtaining new types of integrals. Ingenious theorems by Gauss, Green, and Stokes allow 
us to transform these integrals into one another. 


Software suitable for linear algebra (Lapack, Maple, Mathematica, Matlab) can be found 
in the list at the opening of Part E of the book if needed. 


Numeric linear algebra (Chap. 20) can be studied directly after Chap. 7 or 8 because 
Chap. 20 is independent of the other chapters in Part E on numerics. 
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CHAPTER 7 


Linear Algebra: Matrices, 
Vectors, Determinants. 
Linear Systems 


Linear algebra is a fairly extensive subject that covers vectors and matrices, determinants, 
systems of linear equations, vector spaces and linear transformations, eigenvalue problems, 
and other topics. As an area of study it has a broad appeal in that it has many applications 
in engineering, physics, geometry, computer science, economics, and other areas. It also 
contributes to a deeper understanding of mathematics itself. 

Matrices, which are rectangular arrays of numbers or functions, and vectors are the 
main tools of linear algebra. Matrices are important because they let us express large 
amounts of data and functions in an organized and concise form. Furthermore, since 
matrices are single objects, we denote them by single letters and calculate with them 
directly. All these features have made matrices and vectors very popular for expressing 
scientific and mathematical ideas. 

The chapter keeps a good mix between applications (electric networks, Markov 
processes, traffic flow, etc.) and theory. Chapter 7 is structured as follows: Sections 7.1 
and 7.2 provide an intuitive introduction to matrices and vectors and their operations, 
including matrix multiplication. The next block of sections, that is, Secs. 7.3-7.5 provide 
the most important method for solving systems of linear equations by the Gauss 
elimination method. This method is a cornerstone of linear algebra, and the method 
itself and variants of it appear in different areas of mathematics and in many applications. 
It leads to a consideration of the behavior of solutions and concepts such as rank of a 
matrix, linear independence, and bases. We shift to determinants, a topic that has 
declined in importance, in Secs. 7.6 and 7.7. Section 7.8 covers inverses of matrices. 
The chapter ends with vector spaces, inner product spaces, linear transformations, and 
composition of linear transformations. Eigenvalue problems follow in Chap. 8. 


COMMENT. Numeric linear algebra (Secs. 20.1-20.5) can be studied immediately 
after this chapter. 


Prerequisite: None. 
Sections that may be omitted in a short course: 7.5, 7.9. 
References and Answers to Problems: App. | Part B, and App. 2. 
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7.\ Matrices, Vectors: 
Addition and Scalar Multiplication 


EXAMPLE 1 


The basic concepts and rules of matrix and vector algebra are introduced in Secs. 7.1 and 
7.2 and are followed by linear systems (systems of linear equations), a main application, 
in Sec. 7.3. 

Let us first take a leisurely look at matrices before we formalize our discussion. A matrix 
is arectangular array of numbers or functions which we will enclose in brackets. For example, 


41 42 413 


0.3 1 =5 
r i U6 d21 422 23 |, 
(1) 431 432 433 
e* 2x? 4 
‘a : lay dz asi, ‘ 
e 4x 5 


are matrices. The numbers (or functions) are called entries or, less commonly, elements 
of the matrix. The first matrix in (1) has two rows, which are the horizontal lines of entries. 
Furthermore, it has three columns, which are the vertical lines of entries. The second and 
third matrices are square matrices, which means that each has as many rows as columns— 
3 and 2, respectively. The entries of the second matrix have two indices, signifying their 
location within the matrix. The first index is the number of the row and the second is the 
number of the column, so that together the entry’s position is uniquely identified. For 
example, dogg (read a two three) is in Row 2 and Column 3, etc. The notation is standard 
and applies to all matrices, including those that are not square. 

Matrices having just a single row or column are called vectors. Thus, the fourth matrix 
in (1) has just one row and is called a row vector. The last matrix in (1) has just one 
column and is called a column vector. Because the goal of the indexing of entries was 
to uniquely identify the position of an element within a matrix, one index suffices for 
vectors, whether they are row or column vectors. Thus, the third entry of the row vector 
in (1) is denoted by ag. 

Matrices are handy for storing and processing data in applications. Consider the 
following two common examples. 


Linear Systems, a Major Application of Matrices 
We are given a system of linear equations, briefly a linear system, such as 


4x, + 6x2 + 9x3 = 6 


6x1 — 2x3 = 20 


5x1 = 8xo TT 43> 10 


where x4, X9, X3 are the unknowns. We form the coefficient matrix, call it A, by listing the coefficients of the 
unknowns in the position in which they appear in the linear equations. In the second equation, there is no 
unknown x», which means that the coefficient of x2 is 0 and hence in matrix A, dg2 = 0, Thus, 
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EXAMPLE 2 
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A=|6 0 —2]|. We form another matrix A=] 6 0 -2 20 


by augmenting A with the right sides of the linear system and call it the augmented matrix of the system. 
Since we can go back and recapture the system of linear equations directly from the augmented matrix A,A 

contains all the information of the system and can thus be used to solve the linear system. This means that we 

can just use the augmented matrix to do the calculations needed to solve the system. We shall explain this in 


detail in Sec. 7.3. Meanwhile you may verify by substitution that the solution is x1 = 3,xg = x, %g=—1, 
The notation x1, x2, x3 for the unknowns is practical but not essential; we could choose x, y, z or some other 
letters. 


Sales Figures in Matrix Form 


Sales figures for three products I, I, III in a store on Monday (Mon), Tuesday (Tues),--: may for each week 
be arranged in a matrix 


Mon Tues Wed Thur Fri Sat Sun 


40 33 81 O 21 47 33 I 
A=] 0 12 78 50 50 96 90 |- I 


10 0 0 27 43 78 56 Il 


If the company has 10 stores, we can set up 10 such matrices, one for each store. Then, by adding corresponding 
entries of these matrices, we can get a matrix showing the total sales of each product on each day. Can you think 
of other data which can be stored in matrix form? For instance, in transportation or storage problems? Or in 
listing distances in a network of roads? | 


General Concepts and Notations 


Let us formalize what we just have discussed. We shall denote matrices by capital boldface 
letters A, B, C,---, or by writing the general entry in brackets; thus A = [aj], and so 
on. By an m X n matrix (read m by n matrix) we mean a matrix with m rows and n 
columns—rows always come first! m X n is called the size of the matrix. Thus anim X n 
matrix is of the form 


41 "2 iy Qn 

a21 a22, ps don 
(2) A= [ajx] = 

Gm1 4m2 °** 4mn 


The matrices in (1) are of sizes 2 X 3,3 X 3,2 X 2,1 X 3, and 2 X I, respectively. 

Each entry in (2) has two subscripts. The first is the row number and the second is the 
column number. Thus dg, is the entry in Row 2 and Column 1. 

If m = n, we call A ann X n square matrix. Then its diagonal containing the entries 
411, 422, ***, Gyn 18 called the main diagonal of A. Thus the main diagonals of the two 
square matrices in (1) are ay1, dag, dg3 and e ”, 4x, respectively. 

Square matrices are particularly important, as we shall see. A matrix of any size m X n 
is called a rectangular matrix; this includes square matrices as a special case. 
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EXAMPLE 3 
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Vectors 
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A vector is a matrix with only one row or column. Its entries are called the components 


of the vector. We shall denote vectors by lowercase boldface letters a, b,--- 


or by its 


general component in brackets, a = [a;], and so on. Our special vectors in (1) suggest 


that a (general) row vector is of the form 
a=[d) dg **' dy}. For instance, 


A column vector is of the form 


Addition and Scalar Multiplication 


of Matrices and Vectors 


a=[-2 5 08 0 


What makes matrices and vectors really useful and particularly suitable for computers is 
the fact that we can calculate with them almost as easily as with numbers. Indeed, we 
now introduce rules for addition and for scalar multiplication (multiplication by numbers) 
that were suggested by practical applications. (Multiplication of matrices by matrices 
follows in the next section.) We first need the concept of equality. 


Equality of Matrices 


of different sizes are always different. 


Two matrices A = [aj] and B = [Dj] are equal, written A = B, if and only if 
they have the same size and the corresponding entries are equal, that is, aj3 = by1, 
a12 = big, and so on. Matrices that are not equal are called different. Thus, matrices 


Equality of Matrices 


Let 
YN1 412 
_ and B= 
d21 deg 
Then 
ayy = 4, 
A=B if and only if 
dg, = 3, 


The following matrices are all different. Explain! 


iol led tea 


a2= 0, 


d22 = =L 
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DEFINITION 


EXAMPLE 4 


DEFINITION 


EXAMPLE 5 
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Addition of Matrices 


The sum of two matrices A = [a;,] and B = [bj] of the same size is written 
A + B and has the entries aj, + bj, obtained by adding the corresponding entries 
of A and B. Matrices of different sizes cannot be added. 


As a special case, the sum a + b of two row vectors or two column vectors, which 
must have the same number of components, is obtained by adding the corresponding 
components. 


Addition of Matrices and Vectors 


-4 6 3 Ss. =) “9 


If A= 


| and B= 


| then A+B= 


lc 5° -3 
2 2 2) 
A in Example 3 and our present A cannot be added. If a=[5 7 2] and b=[-6 2 OJ, then 


at+b=[-1 9 2]. 
An application of matrix addition was suggested in Example 2. Many others will follow. a 


0 1 2 3 1 0 


Scalar Multiplication (Multiplication by a Number) 


The product of any m X n matrix A = [aj] and any scalar c (number c) is written 
cA and is the m X n matrix cA = [caj;,] obtained by multiplying each entry of A 
by c. 


Here (—1)A is simply written —A and is called the negative of A. Similarly, (—k)A is 
written —kA. Also, A + (—B) is written A — B and is called the difference of A and B 
(which must have the same size!). 


Scalar Multiplication 


[2.7 -1.8] [-27 18] 3 -2| lo o| 
1 
If A=/0 0.9], thn -A=/| 0 -09], =|o0 1], oA=/0 Ol. 
19.0 —4.5 | 1-90 4.5 | 10-5 0 0 


If a matrix B shows the distances between some cities in miles, 1.609B gives these distances in kilometers. 


Rules for Matrix Addition and Scalar Multiplication. From the familiar laws for the 
addition of numbers we obtain similar laws for the addition of matrices of the same size 
m Xn, namely, 


(a) A+B=B+A 

(b) (A+B)+C=A+(B+C)_ (writenA+B+C) 
(3) 

(c) A+0=A 


(d) A + (—A) = 0. 


Here 0 denotes the zero matrix (of size m X n), that is, the m X n matrix with all entries 
zero. If m = 1 or n = 1, this is a vector, called a zero vector. 
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Hence matrix addition is commutative and associative [by (3a) and (3b)]. 
Similarly, for scalar multiplication we obtain the rules 


(a) 
(b) 
(c) 
(d) 


(4) 


c(A + B) = cA + cB 


(c+ khA=cA+ kA 


c({kA) = (ck)A (written ckA) 
1A =A. 


PROBLEM SET 7-1 


1-7 


GENERAL QUESTIONS 


1. Equality. Give reasons why the five matrices in 
Example 3 are all different. 


2. Double subscript notation. If you write the matrix in 
Example 2 in the form A = [aj], what is a3]? ay3? 


a26 ? 433! 


9 


3. Sizes. What sizes do the matrices in Examples 1, 2, 3, 
and 5 have? 

4. Main diagonal. What is the main diagonal of A in 
Example 1? Of A and B in Example 3? 

5. Scalar multiplication. If A in Example 2 shows the 
number of items sold, what is the matrix B of units sold 
if a unit consists of (a) 5 items and (b) 10 items? 


6. If a 12 X 12 matrix A shows the distances between 
12 cities in kilometers, how can you obtain from A the 
matrix B showing these distances in miles? 


7. Addition of vectors. Can you add: A row and 
a column vector with different numbers of compo- 
nents? With the same number of components? Two 
row vectors with the same number of components 


but 


different numbers of zeros? A vector and a 


scalar? A vector with four components anda2 x 2 
matrix? 


8-16 


Let 


ADDITION AND SCALAR 
MULTIPLICATION OF MATRICES 
AND VECTORS 


0 2 4 0 = 2 
6 5 5], B=| 5 3 4 
I! 0 -3 |-2 4-2 
[Ss 3 [-4 1 
—2 4|, D=!/ 5. Ol, 
oe | 2 -1 


ie 2 
E =| 3 4 
13-1 
ie | -1 | -5 
u= 0 |, v= |, w =| —30 
| -3.0 | 2 | 10 


Find the following expressions, indicating which of the 
tules in (3) or (4) they illustrate, or give reasons why they 
are not defined. 


18. 


19. 


. 24+ 4B, 4B+2A, 0A +B, 04B-—4.2A 
.3A, 0.5B, 34+ 0.5B, 3A +05B+C 

. (4-3)A, 4A), 14B— 3B, 11B 

. 8C + 10D, 2(5D+4C), 0.6C — 0.6D, 


0.6(C — D) 


~-(C+D)+E, (D+ E)+C, O(C — E) + 4D, 


A — 0C 


.(2:7NC, 270), -D+0E, E-D+C+t+u 
. (Su + 5y) — dw, —20(u + v) + 2w, 


E-(u+vy), 10u+v)+w 


.-(u+v)—w, ut+(v—w), C+ Ow, 


OE+u-v 


. 15v — 3w — Ou, 3w t+ l5v, D-u-+t 3C, 


8.5w — ll.lu + 0.4v 


. Resultant of forces. If the above vectors u, v, w 


represent forces in space, their sum is called their 
resultant. Calculate it. 


Equilibrium. By definition, forces are in equilibrium 
if their resultant is the zero vector. Find a force p such 
that the above u, v, w, and p are in equilibrium. 


General rules. Prove (3) and (4) for general 2 x 3 
matrices and scalars c and k. 
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20. TEAM PROJECT. Matrices for Networks. Matrices 


have various engineering applications, as we shall see. 
For instance, they can be used to characterize connections 
in electrical networks, in nets of roads, in production 
processes, etc., as follows. 


(a) Nodal Incidence Matrix. The network in Fig. 155 
consists of six branches (connections) and four nodes 
(points where two or more branches come together). 
One node is the reference node (grounded node, whose 
voltage is zero). We number the other nodes and 
number and direct the branches. This we do arbitrarily. 
The network can now be described by a matrix 
A = [aj], where 


+1 if branch k leaves node G) 
aj, = \ —1 if branch k enters node G) 
0 if branch k does not touch node G). 


A is called the nodal incidence matrix of the network. 
Show that for the network in Fig. 155 the matrix A has 
the given form. 


Branch 1 2 3 4 5 6 


Node @ 1 -1 -1 0 0 


Node @ 0 nf 0 


Node@®}] 0 0 1 O -1 -1 


Fig. 155. Network and nodal incidence 
matrix in Team Project 20(a) 


(b) Find the nodal incidence matrices of the networks 
in Fig. 156. 


Fig. 156. Electrical networks in Team Project 20(b) 


(c) Sketch the three networks corresponding to the 
nodal incidence matrices 


[it & @ di 2 <i & w 
=— tf & 2 let 7S) @ Oh 
| o -1 1 of | 0 0 1-1 0 


=1 1 0 1 0}. 


G.I} =I 0 ll 


(d) Mesh Incidence Matrix. A network can also be 
characterized by the mesh incidence matrix M = [mjx], 
where 


+1 if branch k is in mesh | j 


and has the same orientation 


Mj = \ —1Lif branch k is in mesh | j 


and has the opposite orientation 


0 if branch k is not in mesh | j 


and a mesh is a loop with no branch in its interior (or 
in its exterior). Here, the meshes are numbered and 
directed (oriented) in an arbitrary fashion. Show that 
for the network in Fig. 157, the matrix M has the given 
form, where Row 1 corresponds to mesh 1, etc. 


-l1 
0 


oO 
ll i) 

e 

phi 


0 
1 
0 
1 


K 
| 
HOOF 


0 0 


Fig. 157. Network and matrix M in 
Team Project 20(d) 
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7.2 Matrix Multiplication 


DEFINITION 


EXAMPLE 1 


Matrix multiplication means that one multiplies matrices by matrices. Its definition is 
standard but it looks artificial. Thus you have to study matrix multiplication carefully, 
multiply a few matrices together for practice until you can understand how to do it. Here 
then is the definition. (Motivation follows later.) 


Multiplication of a Matrix by a Matrix 


The product C = AB (in this order) of an m X n matrix A = [aj;,] times anr X p 
matrix B = [bj] is defined if and only if r =n and is then the m X p matrix 
C = [cj] with entries 


n j=ly---,m 
(1) ce = >) Gbue = Grbin + ajebo, + +++ + ainda ee 
1 = 1,---,p. 


The condition r = n means that the second factor, B, must have as many rows as the first 


factor has columns, namely n. A diagram of sizes that shows when matrix multiplication 
is possible is as follows: 


A B = C 
[m Xn] [n X p] =[mX p]. 


The entry c;;, in (1) is obtained by multiplying each entry in the jth row of A by the 
corresponding entry in the kth column of B and then adding these n products. For instance, 
C91 = do1by1 + dggbe, + +++ + daybnz, and so on. One calls this briefly a multiplication 
of rows into columns. For n = 3, this is illustrated by 


n=3 p=2 p=2 
A K aK 
la NO ay ~\ 
i, % 3 1 12 vi. “1 
, Gx Yon Cop 21 [22| =] S21 622 r 
m= m= 
Gay “Og9 Fag bs, bsp 31 32 
Gy, Up Up Ca San 


Notations in a product AB = C 


where we shaded the entries that contribute to the calculation of entry cg; just discussed. 
Matrix multiplication will be motivated by its use in linear transformations in this 
section and more fully in Sec. 7.9. 
Let us illustrate the main points of matrix multiplication by some examples. Note that 
matrix multiplication also includes multiplying a matrix by a vector, since, after all, 
a vector is a special matrix. 


Matrix Multiplication 


3 SSI. 2 3 1 22 =2 43 42 
AB=| 4 0 2))5 0 7 8|/=| 26 —16 14 6 
[2-6 =3 2}|9 —-4 1 1 —9 4 —-37 —28| 


Here cy, = 3: 2+ 5-5 + (—1)-: 9 = 22,and so on. The entry in the box iscag = 4°3 +0°74+2-1= 14. 
The product BA is not defined. B 
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EXAMPLE 3 


EXAMPLE 4 
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Multiplication of a Matrix and a Vector 
4 2)|3 4-3+2:°5 22 3/|4 2 
= whereas is undefined. |_| 
1 8 


5 1-3+8:-5 43 1 8 
Products of Row and Column Vectors 


1 1 3 6 1 
[3 6 1]}2]=[19] 2/[3 6 I]=| 6 12 2 G 
4| 4| 12 24 4| 


CAUTION! Matrix Multiplication Is Not Commutative, AB # BA in General 


This is illustrated by Examples | and 2, where one of the two products is not even defined, and by Example 3, 
where the two products have different sizes. But it also holds for square matrices. For instance, 


1 1} |-1 1 0 0 -1 1 1 1 99 99 
= but = . 
100 100 l =! 0 0 1 ~-1]{|100 100 —99 —99 


It is interesting that this also shows that AB = 0 does not necessarily imply BA = 0 or A = 0 or B = 0. We 
shall discuss this further in Sec. 7.8, along with reasons when this happens. | 


Our examples show that in matrix products the order of factors must always be observed 
very carefully. Otherwise matrix multiplication satisfies rules similar to those for numbers, 
namely. 


(a) (kKA)B = k(AB) = A(KB)_ written kAB or AkB 

(b) A(BC) = (AB)C written ABC 
(2) 

(c) (A+ B)C = AC + BC 


(d) C(A + B) = CA + CB 


provided A, B, and C are such that the expressions on the left are defined; here, k is any 
scalar. (2b) is called the associative law. (2c) and (2d) are called the distributive laws. 

Since matrix multiplication is a multiplication of rows into columns, we can write the 
defining formula (1) more compactly as 


(3) Ck = ajDy, j=10,m k= 1p; 


where a; is the jth row vector of A and b; is the Ath column vector of B, so that in 
agreement with (1), 


Dik 
ajb;, = [aj (a Gin | : = gjrik + djobox, tere + GjnPnk- 


Duk 
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Product in Terms of Row and Column Vectors 


If A = [aj,] is of size 3 X 3 and B = [bj] is of size 3 X 4, then 


| ayb, aybo aybs ayba | 
(4) AB =| aby agbe agb3  agbg |. 
Lasb; agbg agb3  agbz | 
Taking ay = [3 5 —I1],a,=[4 0 2], etc., verify (4) for the product in Example 1. ey) 


Parallel processing of products on the computer is facilitated by a variant of (3) for 
computing C = AB, which is used by standard algorithms (such as in Lapack). In this 
method, A is used as given, B is taken in terms of its column vectors, and the product is 
computed columnwise; thus, 

(5) AB = A[b,_ be b,| = [Ab, Abe Ab, |. 

Columns of B are then assigned to different processors (individually or several to 
each processor), which simultaneously compute the columns of the product matrix 


Abi, Abs, etc. 


Computing Products Columnwise by (5) 


To obtain 


4 1 3 itp 4 1|] 0 4 4 1}|7 34 
-5 2\|-1 -17) [-5 2}, 4} [sf [-s  2]he} [-23 
of AB and then write them as a single matrix, as shown in the first formula on the right. is 


Motivation of Multiplication 
by Linear Transformations 


Let us now motivate the “unnatural” matrix multiplication by its use in linear 
transformations. For n = 2 variables these transformations are of the form 


6*) Vy = 441%1 + AyoXQ 
Ye = dg1X1 + do2xe 


and suffice to explain the idea. (For general n they will be discussed in Sec. 7.9.) For 
instance, (6*) may relate an x;x2-coordinate system to a y,ye-coordinate system in the 
plane. In vectorial form we can write (6*) as 


yi a1 a11X1 + dy2X2 


y2 421 422} |*2 


(6) y= . 
dg1X1 + dgXe 


266 


EXAMPLE 7 


CHAP. 7 Linear Algebra: Matrices, Vectors, Determinants. Linear Systems 


Now suppose further that the x;x2-system is related to a wyw2-system by another linear 
transformation, say, 


x4 byt by2| | wi byw + by2we 
(7) x= = Bw = = ; 
Xe bo, bag | | We bow + bogwe 


Then the y,yo-system is related to the wyw2-system indirectly via the xx -system, and 
we wish to express this relation directly. Substitution will show that this direct relation is 
a linear transformation, too, say, 


C11 C12. | | W4 C11W1 + Cy2We 
(8) y = Cw= = . 
C21 Can} | We C2g1W1 + ConWe 


Indeed, substituting (7) into (6), we obtain 


Yq = 441(11W1 + Dy2W2) + ay2(bo1W1 + bogwe) 
= (a4y1b11 + ay2b21)W1 + (a11b12 + ay2b22)we 
Yo = do1(by1W1 + DygWwe) + dgo(beiw1 + beewWe) 


= (dgib11 + dggbe1)W1 + (de1b12 + degbe2)Wo. 
Comparing this with (8), we see that 
Cy = 441011 + ay2b01 Cy2 = a41D12 + Ay2be2 
C21 = d21by1 + degbe1 C22 = d2ib12 + deggbee. 


This proves that C = AB with the product defined as in (1). For larger matrix sizes the 
idea and result are exactly the same. Only the number of variables changes. We then have 
m variables y and n variables x and p variables w. The matrices A, B, and C = AB then 
have sizes m X n,n X p, and m X p, respectively. And the requirement that C be the 
product AB leads to formula (1) in its general form. This motivates matrix multiplication. 


Transposition 


We obtain the transpose of a matrix by writing its rows as columns (or equivalently its 
columns as rows). This also applies to the transpose of vectors. Thus, a row vector becomes 
a column vector and vice versa. In addition, for square matrices, we can also “reflect” 
the elements along the main diagonal, that is, interchange entries that are symmetrically 
positioned with respect to the main diagonal to obtain the transpose. Hence a;z becomes 
a21, 43, becomes aj3, and so forth. Example 7 illustrates these ideas. Also note that, if A 
is the given matrix, then we denote its transpose by A’. 


Transposition of Matrices and Vectors 
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DEFINITION 


A little more compactly, we can write 


5 4 3 O 7 3 8 1 
5 -8 1{t 
=|-8 oO], |8 -1 5] =/0 -1 —-9], 
4 0 0 
1 0 1 -9 4 7 5 4 


mis 


6 6 
[6 2 3]'=]2 Conversely, 2| =[6 2 3]. 
3 3 
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Transposition of Matrices and Vectors 


The transpose of an m Xn matrix A = [ajz] is the n X m matrix A’ (read A 
transpose) that has the first row of A as its first column, the second row of A as its 
second column, and so on. Thus the transpose of A in (2) is Al = [a,j], written out 


a1 d21 Am1 
= a2 422 Am2, 

(9) A’ = [any] 
An 42n amn 


AS a special case, transposition converts row vectors to column vectors and conversely. 


Transposition gives us a choice in that we can work either with the matrix or its 


transpose, whichever is more convenient. 
Rules for transposition are 


@ (AT =A 
T= AU ili 
(10) (b) (A +B) A +B 
(c) (cA)' = cAT 
(d) (AB)' = B'A’. 
CAUTION! Note that in (10d) the transposed matrices are in reversed order. We leave 


the proofs as an exercise in Probs. 9 and 10. 


Special Matrices 


Certain kinds of matrices will occur quite frequently in our work, and we now list the 


most important ones of them. 


Symmetric and Skew-Symmetric Matrices. 


Transposition gives rise to two useful 


classes of matrices. Symmetric matrices are square matrices whose transpose equals the 
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matrix itself. Skew-symmetric matrices are square matrices whose transpose equals 

minus the matrix. Both cases are defined in (11) and illustrated by Example 8. 

(11) AT=A = (thus axj = ax), A’ = —A (thus ay; = —ajy, hence aj = 0). 
Symmetric Matrix Skew-Symmetric Matrix 


Symmetric and Skew-Symmetric Matrices 


20 120 200 0 1 -3 
A =| 120 10 150 is symmetric, and B=/-1 0 -2 is skew-symmetric. 
| 200 150 30 | | 3 2 0| 


For instance, if a company has three building supply centers C,, Cz, C3, then A could show costs, say, aj for 
handling 1000 bags of cement at center C;, and aj, (j # k) the cost of shipping 1000 bags from C; to C,. Clearly, 
jk = yj if we assume shipping in the opposite direction will cost the same. 

Symmetric matrices have several general properties which make them important. This will be seen as we 
proceed. || 


Triangular Matrices. Upper triangular matrices are square matrices that can have nonzero 
entries only on and above the main diagonal, whereas any entry below the diagonal must be 
zero. Similarly, lower triangular matrices can have nonzero entries only on and below the 
main diagonal. Any entry on the main diagonal of a triangular matrix may be zero or not. 


Upper and Lower Triangular Matrices 


= & ss & 3 0 0 0 
1 4 2 2 0 0 
1 3 on 3 0 0 
, (Oo 3 2 8 -1 0}, a 
(0) 2 1 0) 2 0 
0 0 6 7 6 8 
7 > 1 9 3 6 
Upper triangular Lower triangular 


Diagonal Matrices. These are square matrices that can have nonzero entries only on 
the main diagonal. Any entry above or below the main diagonal must be zero. 

If all the diagonal entries of a diagonal matrix S are equal, say, c, we call S a scalar 
matrix because multiplication of any square matrix A of the same size by S has the same 
effect as the multiplication by a scalar, that is, 


(12) AS = SA = cA. 


In particular, a scalar matrix, whose entries on the main diagonal are all 1, is called a unit 
matrix (or identity matrix) and is denoted by I, or simply by I. For I, formula (12) becomes 


(13) AI = IA =A. 


Diagonal Matrix D. Scalar Matrix S. Unit Matrix I 


2 0 0 c 0 0 1 oO 0 
D=|0 -3 oO], S=|0 c« Ol, I=|0 1 O o 
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EXAMPLE 12 


EXAMPLE 13 


Some Applications of Matrix Multiplication 


Computer Production. Matrix Times Matrix 


Supercomp Ltd produces two computer models PC1086 and PC1186. The matrix A shows the cost per computer 
(in thousands of dollars) and B the production figures for the year 2010 (in multiples of 10,000 units.) Find a 
matrix C that shows the shareholders the cost per quarter (in millions of dollars) for raw material, labor, and 
miscellaneous. 


Quarter 
PC1086 PCI1186 12 3 4 
[1.2 1.6] Raw Components 
3 8 6 9} PC1086 
A =| 0.3 0.4 | Labor B= 
6 2 4 3] PC1186 
|.0.5 0.6 | Miscellaneous 


Solution. 


Quarter 
1 2. 3 4 


[13.2 128 136 15.6 | Raw Components 
C=AB=| 33 3.2 34 3.9] Labor 


| 5.1 5.2 5.4 6.3 | Miscellaneous 


Since cost is given in multiples of $1000 and production in multiples of 10,000 units, the entries of C are 
multiples of $10 millions; thus cy, = 13.2 means $132 million, etc. i 


Weight Watching. Matrix Times Vector 


Suppose that in a weight-watching program, a person of 185 1b burns 350 cal/hr in walking (3 mph), 500 in 
bicycling (13 mph), and 950 in jogging (5.5 mph). Bill, weighing 185 Ib, plans to exercise according to the 
matrix shown. Verify the calculations (W = Walking, B = Bicycling, J = Jogging). 


Ww B J 
MON | 1.0 0 05) _ 825 | MON 
350 
WED | 1.0 10 05 1325 | WED 
500 | = 
FRI 1.5 0 0.5 1000 | FRI 
950 
SAT 2.0 1.5 1.0 2400 | SAT ai 


Markov Process. Powers of a Matrix. Stochastic Matrix 


Suppose that the 2004 state of land use in a city of 60 mi? of built-up area is 
C: Commercially Used 25% I: Industrially Used 20% R: Residentially Used 55%. 


Find the states in 2009, 2014, and 2019, assuming that the transition probabilities for 5-year intervals are given 
by the matrix A and remain practically the same over the time considered. 


From C FromI FromR 


[0.7 O01 0 | Toc 
A=|02 09 02 Tol 


[0.1 0 08} ToR 
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A is a stochastic matrix, that is, a square matrix with all entries nonnegative and all column sums equal to 1. 
Our example concerns a Markov process,’ that is, a process for which the probability of entering a certain state 
depends only on the last state occupied (and the matrix A), not on any earlier state. 


Solution. From the matrix A and the 2004 state we can compute the 2009 state, 
ce [67-234 04-20+0 -55| for 61 © |[25] /195] 
I 0.2: 25+ 0.9-20+0.2-55}=]0.2 0.9 0.2 |} 20] =| 34.0}. 
R [01-25+ 0-20+0.8-55 0.1 #O 0.8} 55 46.5 


To explain: The 2009 figure for C equals 25% times the probability 0.7 that C goes into C, plus 20% times the 
probability 0.1 that I goes into C, plus 55% times the probability 0 that R goes into C. Together, 


25.-0.7 4 


19.5 [%]. 


25 -0.2 4 


Also 20 - 0.9 + 55+ 0.2 = 34[%]. 


Similarly, the new R is 46.5%. We see that the 2009 state vector is the column vector 


y=[19.5 34.0 46.5]7=Ax=A[25 20 55]" 


where the column vector x = [25 20 55]' is the given 2004 state vector. Note that the sum of the entries of 
y is 100 [ %]. Similarly, you may verify that for 2014 and 2019 we get the state vectors 


A’x 


z= Ay 


A(Ax) 


[17.05 43.80 39.15]" 


u= Az 


Answer. 


Ay 


A’x 


[16.315 50.660 33.025]". 


In 2009 the commercial area will be 19.5% (11.7 mi’), the industrial 34% (20.4 mi’), and the 


residential 46.5% (27.9 mi”). For 2014 the corresponding figures are 17.05%, 43.80%, and 39.15%. For 2019 
they are 16.315%, 50.660%, and 33.025%. (In Sec. 8.2 we shall see what happens in the limit, assuming that 


those probabilities remain the same. In the meantime, can you experiment or guess?) 


PROBLEM SET 7.2 


1-10 


GENERAL QUESTIONS 


1. 


Multiplication. Why is multiplication of matrices 
restricted by conditions on the factors? 


Square matrix. What form does a3 X 3 matrix have 
if it is symmetric as well as skew-symmetric? 


Product of vectors. Can every 3 X 3 matrix be 
represented by two vectors as in Example 3? 


Skew-symmetric matrix. How many different entries 
can a 4 X 4 skew-symmetric matrix have? Ann X n 
skew-symmetric matrix? 


Same questions as in Prob. 4 for symmetric matrices. 


Triangular matrix. If U,, Us are upper triangular and 
Ly, Lg are lower triangular, which of the following are 
triangular? 


U, + Us, UU, UZ, Up+Li, Ushi, 
Li + Le 


Idempotent matrix, defined by A? = A. Can you find 
four 2 X 2 idempotent matrices? 


8. 


10. 


Nilpotent matrix, defined by B” = 0 for some m. 
Can you find three 2 X 2 nilpotent matrices? 


. Transposition. Can you prove (10a)—(10c) for 3 x 3 


matrices? For m X n matrices? 


Transposition. (a) Illustrate (10d) by simple examples. 
(b) Prove (10d). 


11-20 | MULTIPLICATION, ADDITION, AND 


TRANSPOSITION OF MATRICES AND 


VECTORS 
Let 
[4—2 3 | ts © 
A=|-2 1 6|, B=/-3 1. O 
| 1 2 2 | 0 0 -2 
lo 1 | 3 
c=| 3 2], a=[1 -2 0], b=| 1 
|-2 0 | -1 


1ANDREI ANDREJEVITCH MARKOV (1856-1922), Russian mathematician, known for his work in 


probability theory. 
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Showing all intermediate results, calculate the following 
expressions or give reasons why they are undefined: 


11. AB, AB', BA, B‘A 

12. AA‘, A’, BB’, B? 

13. CC’, BC, CB, C'B 

14. 3A — 2B, (3A — 2B)', 3A — 2B, 
(3A — 2B)'a™ 

15. Aa, Aa’, (Ab)', b'A™ 

16. BC, BC", Bb, b'B 

17. ABC, ABa, ABb, Ca‘ 

18. ab, ba, aA, Bb 

19. 1.5a + 3.0b, 1.5a' + 3.0b, (A — B)b, Ab — Bb 

20. b'Ab, aBa’, aCC', C'ba 

21. General rules. Prove (2) for2 < 2 matrices A = [aj], 
B = [bj], C = [cx], and a general scalar. 

22. Product. Write AB in Prob. 11 in terms of row and 
column vectors. 

23. Product. Calculate AB in Prob. 11 columnwise. See 
Example 1. 


24, Commutativity. Find all 2 x 2 matrices A = [aj] 
that commute with B = [bj], where bj, = j + k. 

25. TEAM PROJECT. Symmetric and Skew-Symmetric 
Matrices. These matrices occur quite frequently in 
applications, so it is worthwhile to study some of their 
most important properties. 

(a) Verify the claims in (11) that a,; = aj, for a 
symmetric matrix, and ayj = —aj, for a skew- 
symmetric matrix. Give examples. 

(b) Show that for every square matrix C the matrix 
C + C' is symmetric and C — C" is skew-symmetric. 
Write C in the form C = § + T, where S is symmetric 
and T is skew-symmetric and find S and T in terms 
of C. Represent A and B in Probs. 11—20 in this form. 


(c) A linear combination of matrices A, B, C,---,M 
of the same size is an expression of the form 


(14) aA + bB + cC +--: + mM, 


where a,---, m are any scalars. Show that if these 
matrices are square and symmetric, so is (14); similarly, 
if they are skew-symmetric, so is (14). 

(d) Show that AB with symmetric A and B is symmetric 
if and only if A and B commute, that is, AB = BA. 


(e) Under what condition is the product of skew- 
symmetric matrices skew-symmetric? 


26-30 | FURTHER APPLICATIONS 


26. Production. In a production process, let N mean “no 
trouble” and T “trouble.” Let the transition probabilities 
from one day to the next be 0.8 for VN — N, hence 0.2 
for N — T, and 0.5 for T — N, hence 0.5 for T — T. 


27. 


28. 


29. 


30. 
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If today there is no trouble, what is the probability of 
N two days after today? Three days after today? 


CAS Experiment. Markov Process. Write a program 
for a Markov process. Use it to calculate further steps 
in Example 13 of the text. Experiment with other 
stochastic 3 X 3 matrices, also using different starting 
values. 


Concert subscription. In a community of 100,000 
adults, subscribers to a concert series tend to renew their 
subscription with probability 90% and persons presently 
not subscribing will subscribe for the next season with 
probability 0.2%. If the present number of subscribers 
is 1200, can one predict an increase, decrease, or no 
change over each of the next three seasons? 

Profit vector. Two factory outlets Fy and F2 in New 
York and Los Angeles sell sofas (S), chairs (C), and 
tables (T) with a profit of $35, $62, and $30, respectively. 
Let the sales in a certain week be given by the matrix 


Ss c¢ T 

400 60 240] F, 
A= 

100 120 500] Fy 


Introduce a “profit vector” p such that the components 
of v = Ap give the total profits of F, and Fo. 
TEAM PROJECT. Special Linear Transformations. 
Rotations have various applications. We show in this 
project how they can be handled by matrices. 

(a) Rotation in the plane. Show that the linear 
transformation y = Ax with 


cos@ —sin@ X4 yy 
> x= a ae 
sin 0 cos 0 Xo yo 
is a counterclockwise rotation of the Cartesian x4x9- 
coordinate system in the plane about the origin, where 


@ is the angle of rotation. 
(b) Rotation through 18. Show that in (a) 


A= 


AY = 


cosn@ —sin nd 
sin n@ cos n@ 


Is this plausible? Explain this in words. 


(c) Addition formulas for cosine and sine. By 
geometry we should have 


cosa -—sina||cosB -—sinB 
sin @ cos a@ || sin B cos B 
" (a +B) —sin(a + B) 


sin (a + B) cos (a + B) 
Derive from this the addition formulas (6) in App. A3.1. 
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(d) Computer graphics. To visualize a_ three- 
dimensional object with plane faces (e.g., a cube), we 
may store the position vectors of the vertices with 


(e) Rotations in space. Explain y = Ax geometrically 
when A is one of the three matrices 


respect to a suitable x1x9x3-coordinate system (and a / 0 0 
list of the connecting edges) and then obtain a two- 0 cos@ —siné |, 
dimensional image on a video screen by projecting 
the object onto a coordinate plane, for instance, onto [0 sin 0 cos 6 
the x4x9-plane by setting x3 = 0. To change the - : ae : 
ee? y nil ; a COS @ QO -sin@ cosy —sinw 0 
appearance of the image, we can impose a linear 
transformation on the position vectors stored. Show 0 1 0 , | sing cos of Ol. 
that a diagonal matrix D with main diagonal entries 3, 
1, 5 gives from an x = [x;] the new position vector [sin 0 COs 0 0 1 


y = Dx, where y, = 3x, (stretch in the x1-direction 
by a factor 3), yg = x2 (unchanged), y3 = 3x3 (con- 
traction in the x3-direction). What effect would a scalar 
matrix have? 


Gauss Elimination 


What effect would these transformations have in situations 
such as that described in (d)? 


We now come to one of the most important use of matrices, that is, using matrices to 
solve systems of linear equations. We showed informally, in Example | of Sec. 7.1, how 
to represent the information contained in a system of linear equations by a matrix, called 
the augmented matrix. This matrix will then be used in solving the linear system of 
equations. Our approach to solving linear systems is called the Gauss elimination method. 
Since this method is so fundamental to linear algebra, the student should be alert. 

A shorter term for systems of linear equations is just linear systems. Linear systems 
model many applications in engineering, economics, statistics, and many other areas. 
Electrical networks, traffic flow, and commodity markets may serve as specific examples 
of applications. 


Linear System, Coefficient Matrix, Augmented Matrix 


A linear system of m equations in n unknowns x1,---,x,, is a set of equations of 
the form 
Q44%1 aie = AnXn by 
a2{X1 ot ae agnXn bo 
(1) 
yon ate SO Xie Oi 


The system is called linear because each variable x; appears in the first power only, just 
as in the equation of a straight line. a11,---, dy are given numbers, called the coefficients 
of the system. bj,---, b,, on the right are also given numbers. If all the b; are zero, then 
(1) is called a homogeneous system. If at least one b; is not zero, then (1) is called a 
nonhomogeneous system. 
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A solution of (1) is a set of numbers x1,---,x, that satisfies all the m equations. 
A solution vector of (1) is a vector x whose components form a solution of (1). If the 
system (1) is homogeneous, it always has at least the trivial solution x; = 0,---,x, = 0. 


Matrix Form of the Linear System (1). From the definition of matrix multiplication 
we see that the m equations of (1) may be written as a single vector equation 


(2) Ax =b 


where the coefficient matrix A = [aj,] is the m X n matrix 


X41 
a1 a2 ain 
by 
a21 a22, aes don . 
A= », and x=]: and b=] . 
bm 
aml Am2 Amn 
Xn 


are column vectors. We assume that the coefficients aj, are not all zero, so that A is 
not a zero matrix. Note that x has n components, whereas b has m components. The 
matrix 


a1 7 ay | by 


am1 — Amn | Dn 


is called the augmented matrix of the system (1). The dashed vertical line could be 
omitted, as we shall do later. It is merely a reminder that the last column of A did not 
come from matrix A but came from vector b. Thus, we augmented the matrix A. 

Note that the augmented matrix A determines the system (1) completely because it 
contains all the given numbers appearing in (1). 


Geometric Interpretation. Existence and Uniqueness of Solutions 


If m = n = 2, we have two equations in two unknowns x1, x2 
41X1 + ayoxXe = by 
1X1 + dg2X2 = be. 


If we interpret x1, x2 as coordinates in the xx -plane, then each of the two equations represents a straight line, 
and (x1, Xg) is a solution if and only if the point P with coordinates x1, x2 lies on both lines. Hence there are 
three possible cases (see Fig. 158 on next page): 


(a) Precisely one solution if the lines intersect 
(b) Infinitely many solutions if the lines coincide 


(c) No solution if the lines are parallel 


N 


74 


% 


Unique solution 


Wh 


Infinitely 
many solutions 


Wh 


No solution 


Fig. 158. Three 
equations in 
three unknowns 
interpreted as 
planes in space 
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For instance, 


xX, +X, =1 xX, +X, =1 xX, +xX,=1 
2x1 -X=0 2x, + 2x, = 2 Xx, +2X%,=0 
Case (a) Case (b) Case (c) 

x2 x2 x2 


If the system is homogenous, Case (c) cannot happen, because then those two straight lines pass through the 
origin, whose coordinates (0, 0) constitute the trivial solution. Similarly, our present discussion can be extended 
from two equations in two unknowns to three equations in three unknowns. We give the geometric interpretation 
of three possible cases concerning solutions in Fig. 158. Instead of straight lines we have planes and the solution 
depends on the positioning of these planes in space relative to each other. The student may wish to come up 
with some specific examples. 3] 


Our simple example illustrated that a system (1) may have no solution. This leads to such 
questions as: Does a given system (1) have a solution? Under what conditions does it have 
precisely one solution? If it has more than one solution, how can we characterize the set 
of all solutions? We shall consider such questions in Sec. 7.5. 

First, however, let us discuss an important systematic method for solving linear systems. 


Gauss Elimination and Back Substitution 


The Gauss elimination method can be motivated as follows. Consider a linear system that 
is in triangular form (in full, upper triangular form) such as 


2x4 “F Sxe = 2 


13x29 = —26 


(Triangular means that all the nonzero entries of the corresponding coefficient matrix lie 
above the diagonal and form an upside-down 90° triangle.) Then we can solve the system 
by back substitution, that is, we solve the last equation for the variable, xx = —26/13 = —2, 
and then work backward, substituting x2 = —2 into the first equation and solving it for x4, 
obtaining x1 (2 5x9) 5(2 5 - (—2)) = 6. This gives us the idea of first reducing 
a general system to triangular form. For instance, let the given system be 


2x4 +35xg= 2 2 5 2 
Its augmented matrix is : 
4x1 t 3x9 = 30. —4 3 —30 


We leave the first equation as it is. We eliminate x, from the second equation, to get a 
triangular system. For this we add twice the first equation to the second, and we do the same 
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operation on the rows of the augmented matrix. This gives —4x, + 4x1 + 3xq + 10xg = 
—30 + 2: 2, that is, 


Ix, +5x,= 2 a Ss 8 


13x2 = —26 Row2+2Rowl1 |0 13. —26 


where Row 2 + 2 Row | means “Add twice Row | to Row 2” in the original matrix. This 
is the Gauss elimination (for 2 equations in 2 unknowns) giving the triangular form, from 
which back substitution now yields x2 = —2 and x, = 6, as before. 

Since a linear system is completely determined by its augmented matrix, Gauss 
elimination can be done by merely considering the matrices, as we have just indicated. 
We do this again in the next example, emphasizing the matrices by writing them first and 
the equations behind them, just as a help in order not to lose track. 


Gauss Elimination. Electrical Network 


Solve the linear system 


Xy- xXgt x= 0 
x14 xo x3 0 
10xg + 25x3 = 90 

20x, + 10x2 = 80. 


Derivation from the circuit in Fig. 159 (Optional). This is the system for the unknown currents 
X1 = 11, X2 = ig, X3 = ig in the electrical network in Fig. 159. To obtain it, we label the currents as shown, 
choosing directions arbitrarily; if a current will come out negative, this will simply mean that the current flows 
against the direction of our arrow. The current entering each battery will be the same as the current leaving it. 
The equations for the currents result from Kirchhoff’s laws: 


Kirchhoff’s Current Law (KCL). At any point of a circuit, the sum of the inflowing currents equals the sum 
of the outflowing currents. 


Kirchhoff’s Voltage Law (KVL). In any closed loop, the sum of all voltage drops equals the impressed 
electromotive force. 


Node P gives the first equation, node Q the second, the right loop the third, and the left loop the fourth, as 
indicated in the figure. 


Node P: ii- i,+ i,= 0 
Node Q: -i,+ i,- i,= 0 
Right loop: 107, + 251, = 90 
P 159 Left loop: 207, + 10i, = 80 


Fig. 159. Network in Example 2 and equations relating the currents 


Solution by Gauss Elimination. This system could be solved rather quickly by noticing its particular 
form. But this is not the point. The point is that the Gauss elimination is systematic and will work in general, 
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also for large systems. We apply it to our system and then do back substitution. As indicated, let us write the 
augmented matrix of the system first and then the system itself: 


Augmented Matrix A Equations 
Pivot 1 > =1 1! 0 Pivot 1 > By x x3= 0 
O- 11 1+ 
=i Lk = ib @ xXyJ+ x2 x3 0 
I 
Eliminate > Oo; 10 25 ! 90 Eliminate > 10xg + 25x3 = 90 
I 
20) 10 0 ! 80 20x34} + 10x = 80. 


Step 1. Elimination of x, 
Call the first row of A the pivot row and the first equation the pivot equation. Call the coefficient 1 of its 
x4-term the pivot in this step. Use this equation to eliminate x, (get rid of x1) in the other equations. For this, do: 


Add | times the pivot equation to the second equation. 
Add —20 times the pivot equation to the fourth equation. 


This corresponds to row operations on the augmented matrix as indicated in BLUE behind the new matrix in 
(3). So the operations are performed on the preceding matrix. The result is 


t =1 1 0 Xy- xXg+t xg= 0 
0 0 O04} O| Row2+Row1 o= 0 
_ 0 10 25 | 90 10x» + 25x3 = 90 
0 30 -20 | 80| Row4 —20Row 1 Aka — Dig = 80; 


Step 2. Elimination of x2 

The first equation remains as it is. We want the new second equation to serve as the next pivot equation. But 
since it has no x5-term (in fact, it is 0 = 0), we must first change the order of the equations and the corresponding 
rows of the new matrix. We put 0 = 0 at the end and move the third equation and the fourth equation one place 
up. This is called partial pivoting (as opposed to the rarely used total pivoting, in which the order of the 
unknowns is also changed). It gives 


1 =i 1! 0 XY xg 7 ig 0 

| 
Pivot 10 >/0 25! 90 Pivot 10 > (f0x2)+ 25x3 = 90 
Eliminate 30 > | 0 [30] —20 80 Eliminate 30x, > [30x / 20x3 = 80 
(en) 0! oO 0= 0 


To eliminate x9, do: 


Add —3 times the pivot equation to the third equation. 
The result is 


1 <= 1 | 0 Xp Xp t+ XZ= 0 
0 10 25 | 90 10xg + 25x3 = 90 
_ 0 0 —95 ! —190 Row 3 — 3 Row 2 — 95x3 = —190 
0 0 0 | 0 o= 0. 


Back Substitution. Determination of x3, x2, x1 (in this order) 
Working backward from the last to the first equation of this “triangular” system (4), we can now readily find 
x3, then x9, and then x1: 


— 95x3 = —190 x3 =i3 =2[A] 
10x29 + 25x3 = 90 x2 = 79(90 — 25x3) = ig = 4[A] 
x17 x27 30> 0 xy =X — x3 = iy = 2[A] 


where A stands for “amperes.” This is the answer to our problem. The solution is unique. (a) 
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THEOREM -1 


Elementary Row Operations. Row-Equivalent Systems 


Example 2 illustrates the operations of the Gauss elimination. These are the first two of 
three operations, which are called 


Elementary Row Operations for Matrices: 


Interchange of two rows 
Addition of a constant multiple of one row to another row 


Multiplication of a row by a nonzero constant c 


CAUTION! These operations are for rows, not for columns! They correspond to the 
following 


Elementary Operations for Equations: 


Interchange of two equations 
Addition of a constant multiple of one equation to another equation 


Multiplication of an equation by a nonzero constant c 


Clearly, the interchange of two equations does not alter the solution set. Neither does their 
addition because we can undo it by a corresponding subtraction. Similarly for their 
multiplication, which we can undo by multiplying the new equation by 1/c (since c # 0), 
producing the original equation. 

We now call a linear system S; row-equivalent to a linear system Sp if S; can be 
obtained from Sz by (finitely many!) row operations. This justifies Gauss elimination and 
establishes the following result. 


Row-Equivalent Systems 


Row-equivalent linear systems have the same set of solutions. 


Because of this theorem, systems having the same solution sets are often called 
equivalent systems. But note well that we are dealing with row operations. No column 
operations on the augmented matrix are permitted in this context because they would 
generally alter the solution set. 

A linear system (1) is called overdetermined if it has more equations than unknowns, 
as in Example 2, determined if m = n, as in Example 1, and underdetermined if it has 
fewer equations than unknowns. 

Furthermore, a system (1) is called consistent if it has at least one solution (thus, one 
solution or infinitely many solutions), but inconsistent if it has no solutions at all, as 
Xz, + xq = 1,x1 + x2 = 0 in Example 1, Case (c). 


Gauss Elimination: The Three Possible 
Cases of Systems 


We have seen, in Example 2, that Gauss elimination can solve linear systems that have a 
unique solution. This leaves us to apply Gauss elimination to a system with infinitely 
many solutions (in Example 3) and one with no solution (in Example 4). 
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Gauss Elimination if Infinitely Many Solutions Exist 


Solve the following linear system of three equations in four unknowns whose augmented matrix is 


[30 20 20 


-5.0 | 8.0] 2.0x9 + 2.0x3 — 5.0x4 = 8.0 
. 
(5) 0.6 1.5 15 —-54 | 2.7). Thus, 0.6x1 1.5x9 + 1.5x3 — 5.4xq4 = 2.7 
| 
[1.2 -0.3 —0.3 2.4 1! 2.14 1.2x4] — 0.3x2g — 0.3x3 + 2.4x4 = 2.1. 
Solution. As in the previous example, we circle pivots and box terms of equations and corresponding 


entries to be eliminated. We indicate the operations in terms of equations and operate on both equations and 
matrices. 


Step 1. Elimination of x, from the second and third equations by adding 


—0.6/3.0 = —0.2 times the first equation to the second equation, 


—1.2/3.0 = —0.4 times the first equation to the third equation. 


This gives the following, in which the pivot of the next step is circled. 


[3.0 20 20 -501!1 80] 3.0x, + 2.0x_ + 2.0x3 — 5.0x4 = 8.0 
| 
(6) | 0 1.1 11 —44 ! 1.1] Row 2 — 0.2 Row 1 + L.lxg - 44%, = 1.1 
| Bt bes 4a 
LO 1, sll 4.4 | —1.1] Row 3 — 0.4 Row 1 1.1 x9 l.lxg + 4.4x4 Teli, 


Step 2. Elimination of xz from the third equation of (6) by adding 


1.1/1.1 = 1 times the second equation to the third equation. 


This gives 
[3.0 20 20 -5.0 | 8.0] 3.0x1 + 2.0x5 + 2.0x3 — 5.0x4 = 8.0 
| 
(7) 0 ilo i -44 ! 1 Llxg + Llxg - 4.4x4 = 1.1 
| 
LO 0 0 0 ! 0 | Row3 + Row 2 0=0. 


Back Substitution. From the second equation, x2 = 1 — x3 + 4x4. From this and the first equation, 
xX, = 2 — xq. Since xg and xq remain arbitrary, we have infinitely many solutions. If we choose a value of x3 
and a value of x4, then the corresponding values of x; and xg are uniquely determined. 


On Notation. If unknowns remain arbitrary, it is also customary to denote them by other letters fy, fo, ---. 
In this example we may thus write xy = 2 — x4 = 2 — fo, xg = 1 — x3 + 4xq = 1 — fy + 4fo, x3 = ty (first 
arbitrary unknown), x4 = tg (second arbitrary unknown). 


Gauss Elimination if no Solution Exists 


What will happen if we apply the Gauss elimination to a linear system that has no solution? The answer is that 
in this case the method will show this fact by producing a contradiction. For instance, consider 


3 2 tk 3 + 2x9 + x3 =3 
. @+m+ 
2 1 1 ! O 2xyJ+ xot+ xg =0 
| 
| 6 2 4 | 6] 6x4|+ 2x2 + 4x3 = 6. 


Step 1. Elimination of x, from the second and third equations by adding 


-2 times the first equation to the second equation, 


-§ = —2 times the first equation to the third equation. 
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This gives 
[3 2 21 ! 3] 3x, + 2x2+ xg= 3 
0 -3 3 ! —2] Row 2 —2Row1 C$xy)+ xX3 = —2 
[0 -2 2 ! O} Row3—2Row1l [= 2xa]+ 2x3 = 0. 


Step 2. Elimination of xz from the third equation gives 


la 2 £-i Ss 3x, + 2x2+ xg= 3 
0 -4 2 | -2 — 3x2 + Bx3 = — 2 
LO 0 0 | 12 | Row 3 — 6 Row 2 = 12. 
The false statement 0 = 12 shows that the system has no solution. B 


Row Echelon Form and Information From It 


At the end of the Gauss elimination the form of the coefficient matrix, the augmented 
matrix, and the system itself are called the row echelon form. In it, rows of zeros, if 
present, are the last rows, and, in each nonzero row, the leftmost nonzero entry is farther 
to the right than in the previous row. For instance, in Example 4 the coefficient matrix 
and its augmented in row echelon form are 


a | - 27% 2 
(8) o -+ 3 and o —-b 2 | -2\]. 
| 
0 0 0 0 001 12 


Note that we do not require that the leftmost nonzero entries be | since this would have 
no theoretic or numeric advantage. (The so-called reduced echelon form, in which those 
entries are 1, will be discussed in Sec. 7.8.) 

The original system of m equations in n unknowns has augmented matrix [Alb]. This 
is to be row reduced to matrix [RIf ]. The two systems Ax = b and Rx = f are equivalent: 
if either one has a solution, so does the other, and the solutions are identical. 

At the end of the Gauss elimination (before the back substitution), the row echelon form 
of the augmented matrix will be 


Mnih 

"en ty 

(9) rm | 
Sirs 

, te 


Here, r = m, 1,1 # 0, and all entries in the blue triangle and blue rectangle are zero. 

The number of nonzero rows, r, in the row-reduced coefficient matrix R is called the 
rank of R and also the rank of A. Here is the method for determining whether Ax = b 
has solutions and what they are: 


(a) No solution. If ris less than m (meaning that R actually has at least one row of 
all Os) and at least one of the numbers /;-+1, f-+2,°**./m is not zero, then the system 
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Rx = f is inconsistent: No solution is possible. Therefore the system Ax = b is 
inconsistent as well. See Example 4, where r = 2 < m = 3 and f,.1 = fg = 12. 


If the system is consistent (either r = m, or r < mand all the numbers f,+1, f-+2,°°* fm 


are zero), then there are solutions. 


(b) Unique solution. If the system is consistent and r =n, there is exactly one 
solution, which can be found by back substitution. See Example 2, where r = n = 3 


and m = 4. 


(c) Infinitely many solutions. To obtain any of these solutions, choose values of 
Xr+1,''',Xn arbitrarily. Then solve the rth equation for x, (in terms of those 
arbitrary values), then the (r — 1)st equation for x, , and so on up the line. See 


Example 3. 


Orientation. Gauss elimination is reasonable in computing time and storage demand. 
We shall consider those aspects in Sec. 20.1 in the chapter on numeric linear algebra. 
Section 7.4 develops fundamental concepts of linear algebra such as linear independence 
and rank of a matrix. These in turn will be used in Sec. 7.5 to fully characterize the 
behavior of linear systems in terms of existence and uniqueness of solutions. 


PROBLEEM—SET 7-3 


1-14 


GAUSS ELIMINATION 


Solve the linear system given explicitly or by its augmented 
matrix. Show details. 


1 4x-6y=-11 2./30 -05 06 
—3x + 8y= 10 " 4.5 "| 
3 xt y- z= 9 4/4 #1 +0 4 
8y + 6z = -6 5 -3 i 2 
—2x + 4y — 6z = 40 ~9 2 -] 5 
5.[ 13 12 -6 6| 4 -8 3 16 
—4 7 -73 =f 2 <5 i 
| 11-13 157 | 3 -6 1 7 
Ps m» 7 ol & 4y +32 =8 
a ¢ 22 i = 
F . Z 3x + 2y =5 
9 —-2y-2z=-8 10[ 5 7 3 #17 
3x + 4y — 5z = 13 —15 21 -9 50 
H/o Ss 5 =10 © 
2-3 -3 6 2 
l4 1 1 -2 4 


12. 


13. 


14. 


15. 


| 
an 
ioe) 

| 
wo 
ies) 


3 4-7 2 =] 
Equivalence relation. By definition, an equivalence 
relation on a set is a relation satisfying three conditions: 
(named as indicated) 
(i) Each element A of the set is equivalent to itself 
(Reflexivity). 
(ii) If A is equivalent to B, then B is equivalent to A 
(Symmetry). 
(iii) If A is equivalent to B and B is equivalent to C, 
then A is equivalent to C (Transitivity). 
Show that row equivalence of matrices satisfies these 
three conditions. Hint. Show that for each of the three 
elementary row operations these conditions hold. 
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16. 


CAS PROJECT. Gauss Elimination and Back 
Substitution. Write a program for Gauss elimination 
and back substitution (a) that does not include pivoting 
and (b) that does include pivoting. Apply the programs 
to Probs. 11-14 and to some larger systems of your 
choice. 


17-21| MODELS OF NETWORKS 


In Probs. 17-19, using Kirchhoff’s laws (see Example 2) 
and showing the details, find the currents: 


17. 


18. 


16V 
: 
20 20 
1Q 
L 
4Q : I, 
= 
32 V 
4Q 120 8Q 
2vi pe 
I, 
I, I; 


20. 


21. 


Wheatstone bridge. Show that if R,/R3 = R1/Rz in 
the figure, then J = 0. (Ro is the resistance of the 
instrument by which / is measured.) This bridge is a 
method for determining R,. Rj, Ro, Rg are known. R3 
is variable. To get R,, make J] = 0 by varying R3. Then 
calculate Ry = R3R1/Ro. 


400 


800 


600 800 
1000 1200 
600 1000 


Wheatstone bridge 
Problem 20 


Net of one-way streets 


Problem 21 


Traffic flow. Methods of electrical circuit analysis 
have applications to other fields. For instance, applying 


22. 


23. 


24. 
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the analog of Kirchhoff’s Current Law, find the traffic 
flow (cars per hour) in the net of one-way streets (in 
the directions indicated by the arrows) shown in the 
figure. Is the solution unique? 


Models of markets. Determine the equilibrium 
solution (D, = S1, Dz = Sz) of the two-commodity 
market with linear model (D, S, P = demand, supply, 
price; index | = first commodity, index 2 = second 
commodity) 


D, = 40 — 2P, — Pa, S; = 4P, — Po + 4, 


Dz = 5P, — 2P2 + 16, So = 3Py — 4. 


Balancing a chemical equation x,C3Hg + x20. > 
x3CO, + x4H2O means finding integer x1, x9, *3, X4 
such that the numbers of atoms of carbon (C), hydrogen 
(H), and oxygen (O) are the same on both sides of this 
reaction, in which propane C3Hg and Og give carbon 
dioxide and water. Find the smallest positive integers 


X4,°°', X4. 


PROJECT. Elementary Matrices. The idea is that 
elementary operations can be accomplished by matrix 
multiplication. If A is an m X n matrix on which we 
want to do an elementary operation, then there is a 
matrix E such that EA is the new matrix after the 
operation. Such an E is called an elementary matrix. 
This idea can be helpful, for instance, in the design 
of algorithms. (Computationally, it is generally prefer- 
able to do row operations directly, rather than by 
multiplication by E.) 


(a) Show that the following are elementary matrices, 
for interchanging Rows 2 and 3, for adding —5 times 
the first row to the third, and for multiplying the fourth 
row by 8. 


100 0 
00 1 0 
EK, = > 
0 1 0 0 
| 0 0 0 1 
a) aye 
0 10 0 
E> = ; 
= ar a ee 
Lo 0 @ 
f 100 0 
0 10 0 
E3 = 
001 0 
000 8 
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Apply E,, Es, Eg to a vector and to a4 X 3 matrix of unit matrix. Prove that if M is obtained from A by an 
your choice. Find B = E3E9E,A, where A = [ajx] is elementary row operation, then 
the general 4 X 2 matrix. Is B equal to C = E,EE3A? M = EA, 
(b) Conclude that E,, Eo, E3 are obtained by doing where E is obtained from the n X n unit matrix 1, by 
the corresponding elementary operations on the 4 x 4 the same row operation. 


7.4 Linear Independence. Rank of a Matrix. 


Vector Space 


Since our next goal is to fully characterize the behavior of linear systems in terms 
of existence and uniqueness of solutions (Sec. 7.5), we have to introduce new 
fundamental linear algebraic concepts that will aid us in doing so. Foremost among 
these are linear independence and the rank of a matrix. Keep in mind that these 
concepts are intimately linked with the important Gauss elimination method and how 
it works. 


Linear Independence and Dependence of Vectors 


Given any set of m vectors a(4),°**, An) (with the same number of components), a linear 
combination of these vectors is an expression of the form 


c1aq) + cgay + +++ + Cmacm) 
where cj, C2,°**, Cm are any scalars. Now consider the equation 
(1) C184) ote c2a(2) ap ooo op Cmaam) = 0. 


Clearly, this vector equation (1) holds if we choose all c;’s zero, because then it becomes 
0 = 0. If this is the only m-tuple of scalars for which (1) holds, then our vectors 
Aq), °°", Ag) are said to form a linearly independent set or, more briefly, we call them 
linearly independent. Otherwise, if (1) also holds with scalars not all zero, we call these 
vectors linearly dependent. This means that we can express at least one of the vectors 
as a linear combination of the other vectors. For instance, if (1) holds with, say, 
c, # 0, we can solve (1) for ac): 


aq) = koa) Si Kmaun) where k; _ =Gj/ C1. 


(Some k,;’s may be zero. Or even all of them, namely, if aq) = 0.) 

Why is linear independence important? Well, if a set of vectors is linearly 
dependent, then we can get rid of at least one or perhaps more of the vectors until we 
get a linearly independent set. This set is then the smallest “truly essential” set with 
which we can work. Thus, we cannot express any of the vectors, of this set, linearly 
in terms of the others. 
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EXAMPLE=1 


DEFINITION 


EXAMPLE 2 


THEOREM 1 


Linear Independence and Dependence 
The three vectors 
ag =[ 3 0 2 2] 
ag =[-6 42 24 54] 
ag) =[21 —21 0 —15] 
are linearly dependent because 
6a) — 2A) — ag = 0. 


Although this is easily checked by vector arithmetic (do it!), it is not so easy to discover. However, a systematic 
method for finding out about linear independence and dependence follows below. 

The first two of the three vectors are linearly independent because cya) + c2a2) = 0 implies co = 0 (from 
the second components) and then cj = 0 (from any other component of ac). 3] 


Rank of a Matrix 


The rank of a matrix A is the maximum number of linearly independent row vectors 
of A. It is denoted by rank A. 


Our further discussion will show that the rank of a matrix is an important key concept for 
understanding general properties of matrices and linear systems of equations. 


Rank 
The matrix 
| 3 0 2 2] 
(2) A=|-6 42 24 54 
| 21 21 0 15 


has rank 2, because Example | shows that the first two row vectors are linearly independent, whereas all three 
row vectors are linearly dependent. 
Note further that rank A = 0 if and only if A = 0. This follows directly from the definition. fs] 


We call a matrix A, row-equivalent to a matrix Ayif A, can be obtained from Ag by 
(finitely many!) elementary row operations. 

Now the maximum number of linearly independent row vectors of a matrix does not 
change if we change the order of rows or multiply a row by a nonzero c or take a linear 
combination by adding a multiple of a row to another row. This shows that rank is 
invariant under elementary row operations: 


Row-Equivalent Matrices 


Row-equivalent matrices have the same rank. 


Hence we can determine the rank of a matrix by reducing the matrix to row-echelon 
form, as was done in Sec. 7.3. Once the matrix is in row-echelon form, we count the 
number of nonzero rows, which is precisely the rank of the matrix. 
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EXAMPLE 3 _ Determination of Rank 
For the matrix in Example 2 we obtain successively 
3 0 2 2 
A =| -6 42 24 54 (given) 
21 =—21 O° =15 
3 0 2 2 
0 42 28 58 | Row2 + 2 Row 1 
0 —21 —-14 —29 | Row3 —7Rowl 


0 0 0 0 | Row3 + 3Row2. 


The last matrix is in row-echelon form and has two nonzero rows. Hence rank A = 2, as before. fe 


Examples 1-3 illustrate the following useful theorem (with p = 3, n = 3, and the rank of 
the matrix = 2). 


THEOREM 2 Linear Independence and Dependence of Vectors 


Consider p vectors that each have n components. Then these vectors are linearly 
independent if the matrix formed, with these vectors as row vectors, has rank p. 
However, these vectors are linearly dependent if that matrix has rank less than p. 


Further important properties will result from the basic 


THEOREM 3 Rank in Terms of Column Vectors 


The rank r of a matrix A equals the maximum number of linearly independent 


column vectors of A. 
Hence A and its transpose A" have the same rank. 


PROOF In this proof we write simply “rows” and “columns” for row and column vectors. Let A 
be anm X n matrix of rank A = r. Then by definition of rank, A has r linearly independent 


rows which we denote by V4), °* +, Vy (regardless of their position in A), and all the rows 
Aq), °**, Aq) Of A are linear combinations of those, say, 

Aad = C11Vay + C12V2) + ++ + C1rVory 
(3) AQ) = C21Vay + Co2V2) + + + CarVery 


Aum) = CmiVa) + Cm2Veay + °° + CmrVoo- 


SEC. 7.4 Linear Independence. Rank of a Matrix. Vector Space 285 


EXAMPLE 4 


THEOREM 4 


PROOF 


These are vector equations for rows. To switch to columns, we write (3) in terms of 
components as n such systems, with k = 1,---,n, 


Ak = CyWik + CiQven +++ + CyWVrk 


doy = C2WV1~ + Coven + +++ + CoWre 


(4) 


Amk = CmWik + Cm2v2ek + °°* + CmrUrk 


and collect components in columns. Indeed, we can write (4) as 


ak C11 C12 Cir 
a2k C21 C29, Cor 
(5) . |=] . | tvae} . ft + UK 
amk Cm1 Cm2 Cmr 
where k = 1,-+-,n. Now the vector on the left is the Ath column vector of A. We see that 


each of these n columns is a linear combination of the same r columns on the right. Hence 
A cannot have more linearly independent columns than rows, whose number is rank A = r. 
Now rows of A are columns of the transpose A". For A! our conclusion is that A‘ cannot 
have more linearly independent columns than rows, so that A cannot have more linearly 
independent rows than columns. Together, the number of linearly independent columns 
of A must be r, the rank of A. This completes the proof. ia 


Illustration of Theorem 3 


The matrix in (2) has rank 2. From Example 3 we see that the first two row vectors are linearly independent 
and by “working backward” we can verify that Row 3 = 6 Row | — 3 Row 2. Similarly, the first two columns 
are linearly independent, and by reducing the last matrix in Example 3 by columns we find that 


Column 3 = 2 Column 1 + 2 Column 2 and Column 4 = 2 Column 1+ St Column 2. Ba 


Combining Theorems 2 and 3 we obtain 


Linear Dependence of Vectors 


Consider p vectors each having n components. If n < p, then these vectors are 
linearly dependent. 


The matrix A with those p vectors as row vectors has p rows and n < p columns; hence 
by Theorem 3 it has rank A = n < p, which implies linear dependence by Theorem 2. 


Vector Space 


The following related concepts are of general interest in linear algebra. In the present 
context they provide a clarification of essential properties of matrices and their role in 
connection with linear systems. 
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EXAMPLE 5 


THEOREM 5 


PROOF 


THEOREM 6 
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Consider a nonempty set V of vectors where each vector has the same number of 
components. If, for any two vectors a and b in V, we have that all their linear combinations 
aa + Bb (a, B any real numbers) are also elements of V, and if, furthermore, a and b satisfy 
the laws (3a), (3c), (3d), and (4) in Sec. 7.1, as well as any vectors a, b, c in V satisfy (3b) 
then V is a vector space. Note that here we wrote laws (3) and (4) of Sec. 7.1 in lowercase 
letters a, b, c, which is our notation for vectors. More on vector spaces in Sec. 7.9. 

The maximum number of linearly independent vectors in V is called the dimension of 
V and is denoted by dim V. Here we assume the dimension to be finite; infinite dimension 
will be defined in Sec. 7.9. 

A linearly independent set in V consisting of a maximum possible number of vectors 
in V is called a basis for V. In other words, any largest possible set of independent vectors 
in V forms basis for V. That means, if we add one or more vector to that set, the set will 
be linearly dependent. (See also the beginning of Sec. 7.4 on linear independence and 
dependence of vectors.) Thus, the number of vectors of a basis for V equals dim V. 

The set of all linear combinations of given vectors a(),° ++, Ag) With the same number 
of components is called the span of these vectors. Obviously, a span is a vector space. If 
in addition, the given vectors a), ° ++, Ap) are linearly independent, then they form a basis 
for that vector space. 

This then leads to another equivalent definition of basis. A set of vectors is a basis for 
a vector space V if (1) the vectors in the set are linearly independent, and if (2) any vector 
in V can be expressed as a linear combination of the vectors in the set. If (2) holds, we 
also say that the set of vectors spans the vector space V. 

By a subspace of a vector space V we mean a nonempty subset of V (including V itself) 
that forms a vector space with respect to the two algebraic operations (addition and scalar 
multiplication) defined for the vectors of V. 


Vector Space, Dimension, Basis 


The span of the three vectors in Example | is a vector space of dimension 2. A basis of this vector space consists 
of any two of those three vectors, for instance, a1), a2), OF @(4), Ag), etc. | 


We further note the simple 


Vector Space R” 


The vector space R" consisting of all vectors with n components (n real numbers) 
has dimension n. 


A basis of n vectors is aqy=[1 0 -:- OJ], ag =[0 1 0 =: OJ, =, 
an) = [0 eas 0 1}. |_| 


For a matrix A, we call the span of the row vectors the row space of A. Similarly, the 
span of the column vectors of A is called the column space of A. 

Now, Theorem 3 shows that a matrix A has as many linearly independent rows as 
columns. By the definition of dimension, their number is the dimension of the row space 
or the column space of A. This proves 


Row Space and Column Space 


The row space and the column space of a matrix A have the same dimension, equal 
to rank A. 


SEC. 7.4 Linear Independence. Rank of a Matrix. Vector Space 


287 


Finally, for a given matrix A the solution set of the homogeneous system Ax = 0 is a 
vector space, called the null space of A, and its dimension is called the nullity of A. In 
the next section we motivate and prove the basic relation 


(6) 


rank A + nullity A = Number of columns of A. 


PROBLEM SET 7.4 


1-10 


RANK, ROW SPACE, COLUMN SPACE 


Find the rank. Find a basis for the row space. Find a basis 
for the column space. Hint. Row-reduce the matrix and its 
transpose. (You may omit obvious factors from the vectors 


of these bases.) 
4 -2 6 a b 
1. 2 
= 2 1 =3 boa 
[o. 3° 4 [ 6 -4 0 
3. | 3 5 0 4,| —4 0 2; 
l5 0 10 | 0 2 6 
[0.2 -0.1 04 ro 1 0 
5. | 0 1.1 —0.3 6. | —1 QO —-4 
101 0 -21 Lo + @ 
a [2 4 8 16 
8 0 4 O 
16 8 4 2 
7/0 2 0 4 8 
4 8 16 2 
4 0 2 0 
7 I 2 16 8 4 
[o..0 £ © '; 2 ft OD 
0 0 1 0 =2 0 —4 1 
9, 10. 
1 1 1 1 1 -4 —-ll1 2 
[0 0 1 0 0 1 2 0) 
11. CAS Experiment. Rank. (a) Show experimentally 


that the n X n matrix A = [aj,] with aj, = j + k -— 1 
has rank 2 for any n. (Problem 20 shows n = 4.) Try 
to prove it. 

(b) Do the same when aj, = j + k + c, where c is any 
positive integer. 

(c) What is rank A if aj, = Qitk—29 Try to find other 
large matrices of low rank independent of n. 


12-16 


GENERAL PROPERTIES OF RANK 


Show the following: 


12. 
13. 


14. 


15. 


16. 


rank B'A' = rank AB. (Note the order!) 


rank A = rank B does not imply rank A? = rank B?. 
(Give a counterexample.) 


If A is not square, either the row vectors or the column 
vectors of A are linearly dependent. 


If the row vectors of a square matrix are linearly 
independent, so are the column vectors, and conversely. 


Give examples showing that the rank of a product of 
matrices cannot exceed the rank of either factor. 


17-25 | LINEAR INDEPENDENCE 


Are the following sets of vectors linearly independent? 
Show the details of your work. 


17. [ 
18. [ 


19. [ 
20. [ 


25. [ 


26. 


3 4 0 2], [2 -1 3 7], 

1 16 —12 —22] 

13 3 a) [3 3 4 5] [3 2 5 él, 

1. 4. 2 

45 6 7] 

0 1 1; [1 1 1], [0 0 1] 
12 3 4], [2 3 4 5], [3 4 5 6], 
[4 5 6 7] 
-(2 0 0 7], [2 0 0 8], [2 0 0 9], 
[2 0 1 0 

0.4 —-0.2 0.2], [0 0 O], [3.0 —0.6 1.5] 
9 8 7 6 5], [9 7 5 3 1] 

4 —-1 3], [0 8 1], [1 3 —S], 

2 6 1] 

6 0 -1 3], [2 2 5 0], 

[-4 -4 -4 —4] 

Linearly independent subset. Beginning with the 
last of the vectors [3 0 1 2], [6 1 0 Oj, 
[12 1 2 4], [6 0 2 4), and [9 0 1 2], 


omit one after another until you get a linearly 
independent set. 
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27-35| VECTOR SPACE 


Is the given set of vectors a vector space? Give reasons. If 
your answer is yes, determine the dimension and find a 
basis. (Uj, Va,--- denote components.) 


27. All vectors in R® with v1 — ve + 203 = 0 
28. All vectors in R? with 3v9 + v3 = k 
29. All vectors in R? with v1 = ve 


30. All vectors in R” with the first n — 2 components zero 


31 


32. 


33. 


. All vectors in R® with positive components 


404 q 


t 5SUg = 0 


All vectors in R? with 301 — v3 = 0,7 


All vectors in R? with 30, — 2v9 + v3 = 0, 


2v041 Wr 3v2 = 4u3 =0: 
34. All vectors in R” with lu;| =l1forj=1,---,n 
35. All vectors in R* with V1 = 20g = 303g = 4u4 


7.5 Solutions of Linear Systems: 
Existence, Uniqueness 


Rank, as just defined, gives complete information about existence, uniqueness, and general 


structure of the solution set of linear systems as follows. 


A linear system of equations in n unknowns has a unique solution if the coefficient 
matrix and the augmented matrix have the same rank n, and infinitely many solutions if 
that common rank is less than n. The system has no solution if those two matrices have 


different rank. 


To state this precisely and prove it, we shall use the generally important concept of a 
submatrix of A. By this we mean any matrix obtained from A by omitting some rows or 
columns (or both). By definition this includes A itself (as the matrix obtained by omitting 


no rows or columns); this is practical. 


THEOREM 1 


Fundamental Theorem for Linear Systems 


(a) Existence. A linear system of m equations in n unknowns x1,°** , Xn 


Q41X1 i a42X2 Gp geo Se AAnxXn — by 


(1) ag1X1 ar a29X92 ap 9o0 Se aanXn = bo 


is consistent, that is, has solutions, if and only if the coefficient matrix A and the 
augmented matrix A have the same rank. Here, 


a1 "'* Gn a1 "* Gin by 


I 

I 

I 

. eae . . eae . | 
& I 

A= and A= 
. wae . . wae . I 
I 

I 

nea ve lL % 

am1 amn am1 Amn | m 


(b) Uniqueness. The system (1) has precisely one solution if and only if this 
common rank r of A and A equals n. 
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(c) Infinitely many solutions. [f this common rank r is less than n, the system 
(1) has infinitely many solutions. All of these solutions are obtained by determining 
r suitable unknowns (whose submatrix of coefficients must have rank r) in terms of 
the remaining n — r unknowns, to which arbitrary values can be assigned. (See 
Example 3 in Sec. 7.3.) 


(d) Gauss elimination (Sec. 7.3). /f solutions exist, they can all be obtained by 
the Gauss elimination. (This method will automatically reveal whether or not 
solutions exist; see Sec. 7.3.) 


(a) We can write the system (1) in vector form Ax = b or in terms of column vectors 
Ca), °° * > Con) of A: 


(2) CayX1 + Ca@yxXg +7 + CqyXn = b. 


A is obtained by augmenting A by a single column b. Hence, by Theorem 3 in Sec. 7.4, 
rank A equals rank A or rank A + 1. Now if (1) has a solution x, then (2) shows that b 
must be a linear combination of those column vectors, so that A and A have the same 
maximum number of linearly independent column vectors and thus the same rank. 

Conversely, if rank A = rank A, then b must be a linear combination of the column 
vectors of A, say, 


(2*) b = aya) + +++ + AnC~r) 


since otherwise rank A = rank A + 1. But (2*) means that (1) has a solution, namely, 
X1 = Q4,°'*,Xy = My, as can be seen by comparing (2*) and (2). 

(b) If rank A = n, the n column vectors in (2) are linearly independent by Theorem 3 
in Sec. 7.4. We claim that then the representation (2) of b is unique because otherwise 


Cayx1 tore + Cay Xn = CayXy + + + Cay Xn: 
This would imply (take all terms to the left, with a minus sign) 
(x1 — Xa) + °° + An — Xn)ecy = O 


andx, — X, = 0,:-+,Xn, — X, = O by linear independence. But this means that the scalars 
X4,'°'',Xy in (2) are uniquely determined, that is, the solution of (1) is unique. 

(c) If rank A = rank A = r <r, then by Theorem 3 in Sec. 7.4 there is a linearly 
independent set K of r column vectors of A such that the other n — r column vectors of 
A are linear combinations of those vectors. We renumber the columns and unknowns, 
denoting the renumbered quantities by *, so that {€,1),-++, €qy} is that linearly independent 
set K. Then (2) becomes 


Cady Fe + Cade + CG pdtrer $01 + Cay kn = bd, 


Cy-+,°**, qm) are linear combinations of the vectors of K, and so are the vectors 
Xp t€ gts t's Xn€cn). Expressing these vectors in terms of the vectors of K and collect- 
ing terms, we can thus write the system in the form 


(3) Capyr tee + Cay = b 
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with y; = x; + B;, where B; results from the n — r terms €¢43)%,44,°°+, €qXp; here, 
j = 1,---,r. Since the system has a solution, there are y,---, y, satisfying (3). These 
scalars are unique since K is linearly independent. Choosing <,.,1,+--, %,, fixes the 6; and 
corresponding *; = y; — B;, where j = 1,---,r. 


(d) This was discussed in Sec. 7.3 and is restated here as a reminder. B 


The theorem is illustrated in Sec. 7.3. In Example 2 there is a unique solution since rank 
A = rank A = n = 3 (as can be seen from the last matrix in the example). In Example 3 
we have rank A = rank A = 2<n=4 and can choose x3 and x4 arbitrarily. In 
Example 4 there is no solution because rank A = 2 < rank A = 3. 


Homogeneous Linear System 


Recall from Sec. 7.3 that a linear system (1) is called homogeneous if all the b;’s are 
zero, and nonhomogeneous if one or several b;’s are not zero. For the homogeneous 
system we obtain from the Fundamental Theorem the following results. 


Homogeneous Linear System 


A homogeneous linear system 


a41%1 ar a42X92 =P 600 apr Anxkn = 0) 
a21X1 a a29X2 ap Oo ap aanxn — 0) 
(4) 
Grose, ap Cpe) ar 89° ar Obese, = (0) 
always has the trivial solution x; = 0,---, x, = 0. Nontrivial solutions exist if and 


only if rank A <n. If rank A = r <n, these solutions, together with x = 0, form a 
vector space (see Sec. 7.4) of dimension n — r called the solution space of (4). 

In particular, if X(4) and X(gy are solution vectors of (4), then X = c1Xqy + c2X@) 
with any scalars cy and cz is a solution vector of (4). (This does not hold for 
nonhomogeneous systems. Also, the term solution space is used for homogeneous 
systems only.) 


The first proposition can be seen directly from the system. It agrees with the fact that 
b = 0 implies that rank A = rank A, so that a homogeneous system is always consistent. 
If rank A = n, the trivial solution is the unique solution according to (b) in Theorem 1. 
If rank A < n, there are nontrivial solutions according to (c) in Theorem |. The solutions 
form a vector space because if x(q) and X,g) are any of them, then Ax) = 0, Ax) = 0, 
and this implies A(x(z) + Xay) = AX(y) + AX) = 0 as well as A(cxy) = cAX) = 0, 
where c is arbitrary. If rank A = r <n, Theorem | (c) implies that we can choose n — r 
suitable unknowns, call them x,+1,°++,X,, in an arbitrary fashion, and every solution is 
obtained in this way. Hence a basis for the solution space, briefly called a basis of 
solutions of (4), is yq),°**, ¥m—n, Where the basis vector y,;) is obtained by choosing 
Xy4j = 1 and the other x,44,+++, x, Zero; the corresponding first r components of this 
solution vector are then determined. Thus the solution space of (4) has dimension n — r. 
This proves Theorem 2. @ 


SEC. 7.6 For Reference: Second- and Third-Order Determinants 291 


THEOREM 3 


THEOREM 4 
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The solution space of (4) is also called the null space of A because Ax = 0 for every x in 
the solution space of (4). Its dimension is called the nullity of A. Hence Theorem 2 states that 


(5) rank A + nullity A =n 
where 7 is the number of unknowns (number of columns of A). 


Furthermore, by the definition of rank we have rank A S m in (4). Hence if m <n, 
then rank A < n. By Theorem 2 this gives the practically important 


Homogeneous Linear System with Fewer Equations Than Unknowns 


A homogeneous linear system with fewer equations than unknowns always has 
nontrivial solutions. 


Nonhomogeneous Linear Systems 


The characterization of all solutions of the linear system (1) is now quite simple, as follows. 


Nonhomogeneous Linear System 


If a nonhomogeneous linear system (1) is consistent, then all of its solutions are 
obtained as 


(6) X = Xo + Xp, 


where Xo is any (fixed) solution of (1) and Xp, runs through all the solutions of the 
corresponding homogeneous system (A). 


The difference x; = x — Xo of any two solutions of (1) is a solution of (4) because 
Ax, = A(x — Xo) = Ax — Axg = b — b = 0. Since x is any solution of (1), we get all 
the solutions of (1) if in (6) we take any solution xg of (1) and let x; vary throughout the 
solution space of (4). ia] 


This covers a main part of our discussion of characterizing the solutions of systems of 
linear equations. Our next main topic is determinants and their role in linear equations. 


7.6 For Reference: 
Second- and Third-Order Determinants 


We created this section as a quick general reference section on second- and third-order 
determinants. It is completely independent of the theory in Sec. 7.7 and suffices as a 
reference for many of our examples and problems. Since this section is for reference, go 
on to the next section, consulting this material only when needed. 

A determinant of second order is denoted and defined by 


41 412 


(1) D = det A = = a11422 — 412491. 


a21 422 


So here we have bars (whereas a matrix has brackets). 
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Cramer’s rule for solving linear systems of two equations in two unknowns 


(a) a44X1 + ayoXq = by 
(2) 
(b) dg1x1 + dgox2 = be 
is 
by aye 
bz dg2| — byde2 — ay2b2 
x, = — A 
D D 
(3) 
ay by 
da, be| — ay4bg — bya, 
xo >= = 
D D 
with D as in (1), provided 
D#O0. 


The value D = 0 appears for homogeneous systems with nontrivial solutions. 

We prove (3). To eliminate xz multiply (2a) by das and (2b) by —ajy, and add, 
(441422. — 442491)X1 = byd22 — ayabo. 

Similarly, to eliminate x; multiply (2a) by —ag, and (2b) by ay, and add, 
(441422 — 442421)X2 = a44b2 — byaa). 


Assuming that D = a11de2 — dj2d2, # 0, dividing, and writing the right sides of these 
two equations as determinants, we obtain (3). B 


Cramer’s Rule for Two Equations 


12 3 4 12 
4x1 + 3xg = 12 -8 5 84 2 -8 —56 
If then ay 6 xo 4. 3 
2x1 + 5x9 = —8 | 4 3 14 4 3 14 
2 5 2 5 
Third-Order Determinants 
A determinant of third order can be defined by 
41 42 413 
422 423 42 413 a2 413 
(4) D=}d21 de2 deg} = a1 — ag1 + a31 
a32 433 432 433 a22, 423 
431 432 d33 
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Note the following. The signs on the right are + — +. Each of the three terms on the 
right is an entry in the first column of D times its minor, that is, the second-order 
determinant obtained from D by deleting the row and column of that entry; thus, for a,, 
delete the first row and first column, and so on. 

If we write out the minors in (4), we obtain 


(4*) D = d11422433 — 411423432 + 421413432 — 421412433 + 431412423 — 431413429. 
Cramer’s Rule for Linear Systems of Three Equations 

1X1 + AyoXg + A43x3 = dy 
(5) g1X1 + AgeXe + d23x3 = be 


a31X1 + dg3oX2 + 433X3 = bg 


1S 


dD, Do D3 
(6) 1 => %2= Hs XB =D (D # 0) 
with the determinant D of the system given by (4) and 
by a2 443 41 b a3 41 A by 


Note that D;, Dz, D3 are obtained by replacing Columns 1, 2, 3, respectively, by the 
column of the right sides of (5). 

Cramer’s rule (6) can be derived by eliminations similar to those for (3), but it also 
follows from the general case (Theorem 4) in the next section. 


/./ Determinants. Cramer’s Rule 


Determinants were originally introduced for solving linear systems. Although impractical 
in computations, they have important engineering applications in eigenvalue problems 
(Sec. 8.1), differential equations, vector algebra (Sec. 9.3), and in other areas. They can 
be introduced in several equivalent ways. Our definition is particularly for dealing with 
linear systems. 

A determinant of order n is a scalar associated with ann X n (hence square!) matrix 
A = [aj], and is denoted by 


41 412 - Ain 


d21 422 aa don 


(1) D = detA = 


Gni 4an2 Pag ann 
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For n = 1, this determinant is defined by 


(2) D = a). 

For n = 2 by 

(3a) D = aj Cj + aj2Cjg. + +++ + AinCin VG = 1,2,::-, orn) 
or 

(3b) D = ayCy, + dox~Co, + +°: + Ane Cye (kK = 1,2,°°+, orn). 
Here, 


Ci = (— DI Mx 


and M,y, is a determinant of order n — 1, namely, the determinant of the submatrix of A 
obtained from A by omitting the row and column of the entry a;y,, that is, the jth row and 
the kth column. 

In this way, D is defined in terms of n determinants of order n — 1, each of which is, 
in turn, defined in terms of n — | determinants of order n — 2, and so on—until we 
finally arrive at second-order determinants, in which those submatrices consist of single 
entries whose determinant is defined to be the entry itself. 

From the definition it follows that we may expand D by any row or column, that is, choose 
in (3) the entries in any row or column, similarly when expanding the C;j,’s in (3), and so on. 

This definition is unambiguous, that is, it yields the same value for D no matter which 
columns or rows we choose in expanding. A proof is given in App. 4. 

Terms used in connection with determinants are taken from matrices. In D we have n” 
entries ;,, also n rows and n columns, and a main diagonal on which a4, dg2,°+*, dnn 
stand. Two terms are new: 


Mjx, is called the minor of a;;, in D, and Cj, the cofactor of aj, in D. 
For later use we note that (3) may also be written in terms of minors 


n 

(4a) D= S(-1)? **ajeMjx (j = 1,2,-++, orn) 
k=1 
n « 

(4b) D= DS (H1 aM jx (k = 1,2,-::, orn). 
j=l 


Minors and Cofactors of a Third-Order Determinant 


In (4) of the previous section the minors and cofactors of the entries in the first column can be seen directly. 
For the entries in the second row the minors are 


442 443 411 413 441 412 
M21 = ; M22 = : M23 = 
432 433 431 = 433 431 = 432 
and the cofactors are Co = —Mo1, Coo = +Moape, and Co3 = —Mo3. Similarly for the third row—write these 


down yourself. And verify that the signs in Cj, form a checkerboard pattern 


+ = 4 
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EXAMPLE 2_ Expansions of a Third-Order Determinant 


1 3 0 
6 4 2 + 2 6 
D=| 2 6 4;=1 =3 +0 
0 2 =1 2 =1 0 
= 0 2 


1012 — 0) — 344 + 4) + 00 + 6) 12. 


This is the expansion by the first row. The expansion by the third column is 


2 6 1 3 1 3 
D=0 -—4 ed =0-12+0=-12. 
=i 0 | 0 2 6 
Verify that the other four expansions also give the value —12. B 
EXAMPLE 3 _ Determinant of a Triangular Matrix 
=3 0 0 
4 0 
6 4 0} = -3 =-3-4-5=-60 
2 5 
-1 2 5 


Inspired by this, can you formulate a little theorem on determinants of triangular matrices? Of diagonal 
matrices? ia 


General Properties of Determinants 


There is an attractive way of finding determinants (1) that consists of applying elementary 
row operations to (1). By doing so we obtain an “upper triangular” determinant (see 
Sec. 7.1, for definition with “matrix” replaced by “determinant’”’) whose value is then very 
easy to compute, being just the product of its diagonal entries. This approach is similar 
(but not the same!) to what we did to matrices in Sec. 7.3. In particular, be aware that 
interchanging two rows in a determinant introduces a multiplicative factor of —1 to the 
value of the determinant! Details are as follows. 


THEOREM 1 Behavior of an nth-Order Determinant under Elementary Row Operations 


(a) Interchange of two rows multiplies the value of the determinant by —1. 


(b) Addition of a multiple of a row to another row does not alter the value of the 
determinant. 


(c) Multiplication of a row by a nonzero constant c multiplies the value of the 
determinant by c. (This holds also when c = 0, but no longer gives an elementary 
row operation.) 


PROOF (a) By induction. The statement holds for n = 2 because 


a b 


= ad — be, but = be — ad. 


Cc d 
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We now make the induction hypothesis that (a) holds for determinants of ordern — | 2 2 
and show that it then holds for determinants of order n. Let D be of order n. Let E be 
obtained from D by the interchange of two rows. Expand D and E by a row that is not 
one of those interchanged, call it the jth row. Then by (4a), 


nr nN 
(5) p= 2 KdyMjn, = B= DT NaN 


where N;;, is obtained from the minor Mj, of aj, in D by the interchange of those two 
rows which have been interchanged in D (and which Nj, must both contain because we 
expand by another row!). Now these minors are of order n — 1. Hence the induction 
hypothesis applies and gives Nj, = —Mj,. Thus E = —D by (5). 

(b) Add c times Row i to Row j. Let D be the new determinant. Its entries in Row j 
are dj, + CAjx. If we expand D by this Row j, we see that we can write it as 
D = D, + cDg, where D, = D has in Row j the aj, whereas Dy has in that Row j the 
aj, from the addition. Hence Dg has aj, in both Row i and Row j. Interchanging these 
two rows gives Ds back, but on the other hand it gives —Dog by (a). Together 
Dz = —Dz = 0, so that D = Dy = D. 

(c) Expand the determinant by the row that has been multiplied. 


CAUTION! det (cA) = c” det A (not c det A). Explain why. fe 


Evaluation of Determinants by Reduction to Triangular Form 


Because of Theorem | we may evaluate determinants by reduction to triangular form, as in the Gauss elimination 
for a matrix. For instance (with the blue explanations always referring to the preceding determinant) 


2 0 —4 6 
+ 5 1 0 
0 2 6 6-1 


Oo -4 6 

5 Oo =12 Row 2 — 2 Row | 
7 2 6 —1 

8 3 10 Row 4 + 1.5 Row 1 

Oo —-4 6 

5 9 ~=12 


0 2.4 3.8 Row 3 — 0.4 Row 2 
0 -11.4 29.2 Row 4 — 1.6 Row 2 
0 —4 6 


cooUmUmUmCOmUmUWNINMLCUCOUCOCOCOCClUWNMDLUCOOOCOTCDCOC RN 


0 —0 47.25| Row 4 + 4.75 Row 3 
=2-5-24- 47.25 = 1134. Hi 
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Further Properties of nth-Order Determinants 


(a)-(c) in Theorem 1 hold also for columns. 
(d) Transposition leaves the value of a determinant unaltered. 
(e) A zero row or column renders the value of a determinant zero. 


(f) Proportional rows or columns render the value of a determinant zero. In 
particular, a determinant with two identical rows or columns has the value zero. 


(a)-(e) follow directly from the fact that a determinant can be expanded by any row 
column. In (d), transposition is defined as for matrices, that is, the jth row becomes the 
jth column of the transpose. 

(f) If Row j = c times Row i, then D = cD,, where D, has Row j = Row i. Hence 
an interchange of these rows reproduces Dy, but it also gives —D, by Theorem 1(a). 
Hence D,; = 0 and D = cD, = 0. Similarly for columns. ia 


It is quite remarkable that the important concept of the rank of a matrix A, which is the 
maximum number of linearly independent row or column vectors of A (see Sec. 7.4), can 
be related to determinants. Here we may assume that rank A > 0 because the only matrices 
with rank O are the zero matrices (see Sec. 7.4). 


Rank in Terms of Determinants 


Consider an m X n matrix A = [ajx]: 


(1) A has rank r 2 1 if and only if A has an r X r submatrix with a nonzero 
determinant. 


(2) The determinant of any square submatrix with more than r rows, contained 
in A (if such a matrix exists!) has a value equal to zero. 


Furthermore, ifm = n, we have: 


(3) Ann X n square matrix A has rank n if and only if 


det A # 0. 


The key idea is that elementary row operations (Sec. 7.3) alter neither rank (by Theorem 
1 in Sec. 7.4) nor the property of a determinant being nonzero (by Theorem | in this 
section). The echelon form A ofA (see Sec. 7.3) has r nonzero row vectors (which are 
the first r row vectors) if and only if rank A = r. Without loss of generality, we can 
assume that r = 1. Let R be the r X r submatrix in the left upper corner of A (so that 
the entries of R are in both the first r rows and r columns of A). Now R is triangular, 
with all diagonal entries r;; nonzero. Thus, det R= r11°'' Try # 0. Also det R # 0 for 
the corresponding r X r submatrix R of A because R results from R by elementary row 
operations. This proves part (1). 

Similarly, detS = 0 for any square submatrix S of r+ 1 or more rows perhaps 
contained in A because the corresponding submatrix S$ of A must contain a row of zeros 
(otherwise we would have rank A = r + 1), so that det $ = 0 by Theorem 2. This proves 
part (2). Furthermore, we have proven the theorem for an m X n matrix. 
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For ann X n square matrix A we proceed as follows. To prove (3), we apply part (1) 
(already proven!). This gives us that rank A = n = 1 if and only if A contains ann X n 
submatrix with nonzero determinant. But the only such submatrix contained in our square 
matrix A, is A itself, hence det A # 0. This proves part (3). | 


Cramer’s Rule 


Theorem 3 opens the way to the classical solution formula for linear systems known as 
Cramer’s rule,” which gives solutions as quotients of determinants. Cramer’s rule is not 
practical in computations for which the methods in Secs. 7.3 and 20.1—20.3 are suitable. 
However, Cramer’s rule is of theoretical interest in differential equations (Secs. 2.10 and 
3.3) and in other theoretical work that has engineering applications. 


Cramer’s Theorem (Solution of Linear Systems by Determinants) 


(a) [fa linear system of n equations in the same number of unknowns x4,°++, Xx», 
444X1 + ayoxXq + +++ + AyynXyn = D1 
ag1X1 ar a22X 92 ap e0o cp agnXn = bo 
(6) 
Goshei ar Gyo, TP Pe? AP Waal = Dp 


has a nonzero coefficient determinant D = det A, the system has precisely one 
solution. This solution is given by the formulas 


D, Dz Dy 
ae = 


(7) w= (Cramer’s rule) 


where Dy, is the determinant obtained from D by replacing in D the kth column by 
the column with the entries by,--+, by. 


(b) Hence if the system (6) is homogeneous and D # 0, it has only the trivial 
solution x1 = 0, x29 = 0,°°+, x, = 0. If D = 0, the homogeneous system also has 
nontrivial solutions. 


The augmented matrix A of the system (6) is of size n X (n + 1). Hence its rank can be 
at most n. Now if 


a1 ‘" Mn 
(8) D = detA = +0, 


an1 ~s ann 


2GABRIEL CRAMER (1704-1752), Swiss mathematician. 


SEC. 7.7. Determinants. Cramer’s Rule 299 


EXAMPLE 5 


then rank A =n by Theorem 3. Thus rank A = rank A. Hence, by the Fundamental 
Theorem in Sec. 7.5, the system (6) has a unique solution. 
Let us now prove (7). Expanding D by its kth column, we obtain 


(9) D = ayCi, + do~Cop + +°* + AnkCnk: 


where Cj, is the cofactor of entry a;;, in D. If we replace the entries in the kth column of 
D by any other numbers, we obtain a new determinant, say, D. Clearly, its expansion by 
the kth column will be of the form (9), with ay;,,°-+, An_% replaced by those new numbers 
and the cofactors C;;, as before. In particular, if we choose as new numbers the entries 
441,°**, Gy Of the /th column of D (where / # k), we have a new determinant D which 
has the column [a ,  --- can twice, once as its /th column, and once as its kth because 
of the replacement. Hence D=0 by Theorem 2(f). If we now expand D by the column 
that has been replaced (the Ath column), we thus obtain 


(10) ay1Cik + agiCoK a AniCnk = 0 ¢ # k). 


We now multiply the first equation in (6) by Cy; on both sides, the second by Coz, °--, 
the last by C,,;,, and add the resulting equations. This gives 


Cie. Ft + AynXn) ++ F Cag(Gnix1 + ++ + AnnXn) 


= biCy, + +++ + by Chk. 


(1) 


Collecting terms with the same x;, we can write the left side as 
X1(d41Cy~ + A21CoK ++°* + GniCnk) + +++ + Xn(GinCik + denCor + +++ + GnnCnk). 
From this we see that x; is multiplied by 
AKCik + doa~Con + °°* + adnkCnk- 
Equation (9) shows that this equals D. Similarly, x1 is multiplied by 
aC, + dagyCon + +++ + AniCnx. 


Equation (10) shows that this is zero when / # k. Accordingly, the left side of (11) equals 
simply x;D, so that (11) becomes 


X_-D = byCyy + boCo, + +++ + byCnr. 


Now the right side of this is D;, as defined in the theorem, expanded by its kth column, 
so that division by D gives (7). This proves Cramer’s rule. 

If (6) is homogeneous and D # 0, then each D;, has a column of zeros, so that D., = 0 
by Theorem 2(e), and (7) gives the trivial solution. 

Finally, if (6) is homogeneous and D = 0, then rank A <n by Theorem 3, so that 
nontrivial solutions exist by Theorem 2 in Sec. 7.5. a 


Illustration of Cramer’s Rule (Theorem 4) 


For n = 2, see Example | of Sec. 7.6. Also, at the end of that section, we give Cramer’s rule for a general 
linear system of three equations. 
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Finally, an important application for Cramer’s rule dealing with inverse matrices will 


be given in the next section. 


PROBLEM SET 7-7 


GENERAL PROBLEMS 


1. 


. Second-Order Determinant. 


General Properties of Determinants. Illustrate each 
statement in Theorems | and 2 with an example of 
your choice. 


Expand a_ general 
second-order determinant in four possible ways and 
show that the results agree. 


. Third-Order Determinant. Do the task indicated in 


Theorem 2. Also evaluate D by reduction to triangular 
form. 


. Expansion Numerically Impractical. Show that the 


computation of an nth-order determinant by expansion 
involves n! multiplications, which if a multiplication 
takes 107° sec would take these times: 


n 10 15 20 25 
0.004 22 77 0.5 - 10° 
Time : 
sec min year Ss year Ss 


. Multiplication by Scalar. Show that det (kA) = 


k” det A (not k det A). Give an example. 


. Minors, cofactors. Complete the list in Example 1. 


7-15 


11. 


13. 


EVALUATION OF DETERMINANTS 
Showing the details, evaluate: 
cosa sina 0.4 4.9 
8. 
sinB cos B 1S =13 
cosn@ sinn@ cosht  sinht 
10. 
—sinn@ cosné sinht cosht 
4 -1 8 a b Cc 
0) 2 3 12. Ic a b 
0 0 5 b é a 
0 4 -1 5 4 7 0 0) 
—4 0 3-2 2 8 0 0 
14, 
1 —-3 0 1 0 0 1 5 
—5 2 =] 0 0 QO -2 2 


15. 


16. 


1 2 0 O 
2 4 2 0O 
0 9 2 


0 O 2 16 


CAS EXPERIMENT. Determinant of Zeros and 
Ones. Find the value of the determinant of the n X n 
matrix A, with main diagonal entries all 0 and all 
others 1. Try to find a formula for this. Try to prove it 
by induction. Interpret Ag and Ag as incidence matrices 
(as in Problem Set 7.1 but without the minuses) of a 
triangle and a tetrahedron, respectively; similarly for an 
n-simplex, having n vertices and n(n — 1)/2 edges (and 
spanning R"~1, n = 5, 6,---). 


17-19 


RANK BY DETERMINANTS 


Find the rank by Theorem 3 (which is not very practical) 
and check by row reduction. Show details. 


17. 


19. 


20. 


4. 9 1. 0 4 -6 
ae “a5 18.| 4 O 10 
| 16 12 |-6 10 0 


1 3 2 6 


14 0 8 48 


TEAM PROJECT. Geometric Applications: Curves 
and Surfaces Through Given Points. The idea is to 
get an equation from the vanishing of the determinant 
of a homogeneous linear system as the condition for a 
nontrivial solution in Cramer’s theorem. We explain 
the trick for obtaining such a system for the case of 
a line L through two given points P;: (x1, y1) and 
Py: (X2, yo). The unknown line is ax + by = —c, 
say. We write it as ax + by +c-1=0. To geta 
nontrivial solution a, b, c, the determinant of the 
“coefficients” x, y, 1 must be zero. The system is 


ax + by c:1=0 (Line L) 
(12) ax, + byy +c:1=0 (RyonL) 
axg + byg+c-1=0 (PonL). 
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(a) Line through two points. Derive from D = 0 in 21-25| CRAMER’S RULE 


(12) the familiar formula 


Solve by Cramer’s rule. Check by Gauss elimination and 


XX y-y1 back substitution. Show details. 


a1 he Yi D8, 21. 3x — 5y = 15.5 22. 2x — 4y = -24 


(b) Plane. Find the analog of (12) for a plane through 


three given points. Apply it when the points are 6x + l6y= 5.0 aa Hy 

(1, 1, 1), G, 2, 6), G, 0, 5). 23. By-4¢= 16 2 3x-2yt+ z= 13 
(c) Circle. Find a similar formula for a circle in the 

plane through three given points. Find and sketch the 2x — Sy + 1z= —-27 ax+ y+4z= 11 
circle through (2, 6), (6, 4), (7, 1). 

(d) Sphere. Find the analog of the formula in (c) for a — 92 = 9 x + dy — 32 = —31 
a sphere through four given points. Find the sphere 25. -4dw t+ x+ y = -10 

through (0, 0, 5), (4, 0, 1), (0, 4, 1), (0,0, —3) by this : 

formula or by inspection. w — 4x + Z= 1 

(e) General conic section. Find a formula for a a iget pe He 

general conic section (the vanishing of a determinant 

of 6th order). Try it out for a quadratic parabola and xt+ y-4z= 10 


for a more general conic section of your own choice. 


7.8 Inverse of a Matrix. 
Gauss—Jordan Elimination 


THEOREM -1 


In this section we consider square matrices exclusively. 
The inverse of ann X n matrix A = [ajz] is denoted by AW’ and is ann X n matrix 
such that 


(1) AA 1 =A 'A=I 


where I is the n X n unit matrix (see Sec. 7.2). 

If A has an inverse, then A is called a nonsingular matrix. If A has no inverse, then 
A is called a singular matrix. 

If A has an inverse, the inverse is unique. 

Indeed, if both B and C are inverses of A, then AB = I and CA = J, so that we obtain 
the uniqueness from 


B = IB = (CA)B = C(AB) = CI=C. 


We prove next that A has an inverse (is nonsingular) if and only if it has maximum 
possible rank n. The proof will also show that Ax = b implies x = A’'b provided A? 
exists, and will thus give a motivation for the inverse as well as a relation to linear systems. 
(But this will not give a good method of solving Ax = b numerically because the Gauss 
elimination in Sec. 7.3 requires fewer computations.) 


Existence of the Inverse 


The inverse A~* of ann X n matrix A exists if and only if rank A = n, thus (by 
Theorem 3, Sec. 7.7) ifand only if det A # 0. Hence A is nonsingular if rank A = n, 
and is singular if rank A <n. 
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Let A be a given n X n matrix and consider the linear system 
(2) Ax = b. 


If the inverse Av? exists, then multiplication from the left on both sides and use of (1) 
gives 


A tAx = x = Av'b. 


This shows that (2) has a solution x, which is unique because, for another solution u, we 
have Au = b, so that u = A’ +b = x. Hence A must have rank n by the Fundamental 
Theorem in Sec. 7.5. 

Conversely, let rank A = n. Then by the same theorem, the system (2) has a unique 
solution x for any b. Now the back substitution following the Gauss elimination (Sec. 7.3) 
shows that the components x; of x are linear combinations of those of b. Hence we can 
write 


(3) x = Bb 
with B to be determined. Substitution into (2) gives 
Ax = A(Bb) = (AB)b = Cb = b (C = AB) 
for any b. Hence C = AB = [, the unit matrix. Similarly, if we substitute (2) into (3) we get 
x = Bb = B(Ax) = (BA)x 


for any x (and b = Ax). Hence BA = I. Together, B = A™? exists. H 


Determination of the Inverse by the 
Gauss—Jordan Method 


To actually determine the inverse At ofa nonsingular n X n matrix A, we can use a 
variant of the Gauss elimination (Sec. 7.3), called the Gauss—Jordan elimination.* The 
idea of the method is as follows. 

Using A, we form n linear systems 


Axa) =e, *°*, AXm) = ea 


where the vectors €(1),°*:,@c) are the columns of the n X n unit matrix I; thus, 
ep =[1 0 =: O}',ea =[0 1 0 -=-- OJ", etc. These are n vector equations 
in the unknown vectors X(1),°**, X(m). We combine them into a single matrix equation 


3WILHELM JORDAN (1842-1899), German geodesist and mathematician. He did important geodesic work 
in Africa, where he surveyed oases. [See Althoen, S.C. and R. McLaughlin, Gauss—Jordan reduction: A brief 
history. American Mathematical Monthly, Vol. 94, No. 2 (1987), pp. 130-142.] 

We do not recommend it as a method for solving systems of linear equations, since the number of operations 
in addition to those of the Gauss elimination is larger than that for back substitution, which the Gauss—Jordan 
elimination avoids. See also Sec. 20.1. 
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AX = I, with the unknown matrix X having the columns X,4),---, X(,). Correspondingly, 
we combine the n augmented matrices [A e€gy],---,[A cy] into one wide n X 2n 
“augmented matrix” A= [A I]. Now multiplication of AX = I by A”? from the left 
gives X = A141 = A™1. Hence, to solve AX =I for X, we can apply the Gauss 
elimination to A = [A I]. This gives a matrix of the form [U H] with upper triangular 
U because the Gauss elimination triangularizes systems. The Gauss—Jordan method 
reduces U by further elementary row operations to diagonal form, in fact to the unit matrix 
I. This is done by eliminating the entries of U above the main diagonal and making the 
diagonal entries all | by multiplication (see Example 1). Of course, the method operates 
on the entire matrix [U H], transforming H into some matrix K, hence the entire[U H] 
to [IK]. This is the “augmented matrix” of IX = K. Now IX = X = A™?, as shown 
before. By comparison, K = A7/, so that we can read A“? directly from [I Kj. 
The following example illustrates the practical details of the method. 


Finding the Inverse of a Matrix by Gauss—Jordan Elimination 


Determine the inverse A7? of 


Solution. We apply the Gauss elimination (Sec. 7.3) to the following n X 2n = 3 X 6 matrix, where BLUE 
always refers to the previous matrix. 


[A Ij=| 3 -1 1 0 1 0 
1-1 3 41 0 o 4 
f-1 1 2 1 oo] 

0 2 7 3 1 0} Row2 + 3Rowl 


1] Row3— Rowl 


1 
a) 
| 


0 
0 2 a) 3 1 0 


0) 0 -5! -4 -I1 1| Row3 — Row2 


This is [UH] as produced by the Gauss elimination. Now follow the additional Gauss—Jordan steps, reducing 
U to I, that is, to diagonal form with entries | on the main diagonal. 


1-1 -2] -1 0 0 | -Rowl 
0 1 35) 15 O05 0 | 05Row2 
lo Oo 1 08 0.2 -0.2| —0.2Row3 
[1 -1 0 0.6 04 -04] Row1+2Row3 
0 1 0 =(3 -=—02 0.7 Row 2 — 3.5 Row 3 
10 0 1 08 0.2 0.2] 
[1 0 Of -07 02 03] Rowl + Row2 
0 1 Of -13 -02 0.7 
lo Oo 1 08 02 0.2] 
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The last three columns constitute A~!. Check: 


[-07 o2 03] Ji Oo oO 


=1.3° -=02 


=] 3 4 0.8 0.2 —0.2 0 0 1 


Hence AA71 = I. Similarly, AT la =]. jal 


Formulas for Inverses 


Since finding the inverse of a matrix is really a problem of solving a system of linear 
equations, it is not surprising that Cramer’s rule (Theorem 4, Sec. 7.7) might come into 
play. And similarly, as Cramer’s rule was useful for theoretical study but not for 
computation, so too is the explicit formula (4) in the following theorem useful for 
theoretical considerations but not recommended for actually determining inverse matrices, 
except for the frequently occurring 2 X 2 case as given in (4*). 


Inverse of a Matrix by Determinants 


The inverse of a nonsingular n X n matrix A = [aj] is given by 


Cy Car Cyr 
Cyn Coz Che 
(4) at =—1 _[cG,J" = ial 
det A det A 
Cin Con Cun 


where Cj, is the cofactor of aj, in det A (see Sec. 7.7). (CAUTION! Note well that 
; =i x ‘ 
in A’ ”, the cofactor Cj, occupies the same place as a;,; (not aj,) does in A.) 

In particular, the inverse of 


441 42 = 1 
(4*) A= is A 
a21 422 


We denote the right side of (4) by B and show that BA = I. We first write 
(5) BA = G = [xi] 


and then show that G = I. Now by the definition of matrix multiplication and because of 
the form of B in (4), we obtain (CAUTION! Cy, not C5) 


nN 
(6) ga = > (ay1CyK + ++ + AnitCnx)- 


1 
det A Ash det A 
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Now (9) and (10) in Sec. 7.7 show that the sum (---) on the right is D = det A when 
1 = k, and is zero when | # k. Hence 


1 
= det A = l, 
ae det A . 
sr =O C#K). 
In particular, for 1 = 2 we have in (4), in the first row, Cy, = deg, Coy = —ayg and, 
in the second row, Cjg = —dg1, Cog = ay1. This gives (4*). (2) 


The special case n = 2 occurs quite frequently in geometric and other applications. You 
may perhaps want to memorize formula (4*). Example 2 gives an illustration of (4%). 


Inverse of a 2 X 2 Matrix by Determinants 


3 1 1 4 -1 0.4 —-0.1 
A= es = a 
2 4 10] —2 3 —0.2 0.3 
Further Illustration of Theorem 2 
Using (4), find the inverse of 
f-1 1 2] 
A=} 3 -1 1 
| -1 3 4] 
Solution. We obtain det A 1(—7) — 1-13 + 2- 8 = 10, and in (4), 
=1.- 1 1 2 1 2 
Cu = S15 Cor = — =2, C31 = = 3, 
3 4 3 4 -1 1 
3 61 ={ 2 =1 2 
Cig = — = —13, Co = =—2, C3, = — =; 
=—1 4 -1 4 36 
a> =] =1 1 -1 1 
C13 = = 8, Co3 = — =2, C33 = == 2) 
-1 3 —-1 3 3. =] 
so that by (4), in agreement with Example 1, 
[-0.7 02 03] 
At=|-13 -02 0.7). | 
| 0.8 0.2 —-0.2 | 


Diagonal matrices A = [4;;,], aj, = 0 when j # k, have an inverse if and only if all 


aj; # 0. Then At is diagonal, too, with entries 1/aq4, ---, 1/ann. 
For a diagonal matrix we have in (4) 
Cc a a 1 
11 _ 22, mn _ Bis P| 
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Inverse of a Diagonal Matrix 


Let 
-—0.5 0 0 
A=!/ 0 4 0 
| 0 0 1 | 


Then we obtain the inverse A~+ by inverting each individual diagonal element of A, that is, by taking 1/(—0.5), 3, 
and i as the diagonal entries of ATL that is, 


[-2 0 0| 
At=| 0 0.25 0}. fe] 
0 Oo 1 


Products can be inverted by taking the inverse of each factor and multiplying these 
inverses in reverse order, 


(7) (AC\=: = C Aas 


Hence for more than two factors, 


(8) (AC:--PQ)-'=Q7!P7!.--c At, 


The idea is to start from (1) for AC instead of A, that is, AC(AC)~! = I, and multiply 
it on both sides from the left, first by A~4, which because of A7'A = I gives 


A TAC(AC)7? = C(AC)* 
=A“T=a, 


and then multiplying this on both sides from the left, this time by C71 and by using 
cUC=L 


C7 'C(AC)"* = (AC) = COAT, 
This proves (7), and from it, (8) follows by induction. ei] 
We also note that the inverse of the inverse is the given matrix, as you may prove, 
(9) (AT = A. 
Unusual Properties of Matrix Multiplication. 


Cancellation Laws 


Section 7.2 contains warnings that some properties of matrix multiplication deviate from 
those for numbers, and we are now able to explain the restricted validity of the so-called 
cancellation laws [2] and [3] below, using rank and inverse, concepts that were not yet 
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available in Sec. 7.2. The deviations from the usual are of great practical importance and 
must be carefully observed. They are as follows. 


[1] Matrix multiplication is not commutative, that is, in general we have 
AB = BA. 
[2] AB = 0 does not generally imply A = 0 or B = 0 (or BA = 0); for example, 
1 1) }-1 1 0 0 
; | r Al-k 0] 


[3] AC = AD does not generally imply C = D (even when A # 0). 


Complete answers to [2] and [3] are contained in the following theorem. 


Cancellation Laws 
Let A, B, C be n X n matrices. Then: 
(a) [frank A = n and AB = AC, then B = C. 


(b) /frank A = n, then AB = 0 implies B = 0. Hence if AB = 0, but A # 0 
as well as B # 0, then rank A < n and rank B < n. 


(c) If A is singular, so are BA and AB. 


(a) The inverse of A exists by Theorem 1. Multiplication by A’! from the left gives 
A”'AB = A“'AC, hence B = C. 

(b) Let rank A = n. Then At exists, and AB = 0 implies A TAB =B=0. Similarly 
when rank B = n. This implies the second statement in (b). 

(c;) Rank A <n by Theorem 1. Hence Ax = 0 has nontrivial solutions by Theorem 2 
in Sec. 7.5. Multiplication by B shows that these solutions are also solutions of BAx = 0, 
so that rank (BA) < n by Theorem 2 in Sec. 7.5 and BA is singular by Theorem 1. 

(C2) A’ is singular by Theorem 2(d) in Sec. 7.7. Hence B'A! is singular by part (cj), 
and is equal to (AB) by (10d) in Sec. 7.2. Hence AB is singular by Theorem 2(d) in 
Sec. 7.7. aa} 


Determinants of Matrix Products 


The determinant of a matrix product AB or BA can be written as the product of the 
determinants of the factors, and it is interesting that det AB = det BA, although AB # BA 
in general. The corresponding formula (10) is needed occasionally and can be obtained 
by Gauss—Jordan elimination (see Example |) and from the theorem just proved. 


Determinant of a Product of Matrices 


For any n X n matrices A and B, 


(10) det (AB) = det (BA) = det A det B. 
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If A or B is singular, so are AB and BA by Theorem 3(c), and (10) reduces to 0 = 0 by 
Theorem 3 in Sec. 7.7. 

Now let A and B be nonsingular. Then we can reduce A to a diagonal matrix A= [ajx] 
by Gauss—Jordan steps. Under these operations, det A retains its value, by Theorem | in 
Sec. 7.7, (a) and (b) [not (c)] except perhaps for a sign reversal in row interchanging when 
pivoting. But the same operations reduce AB to AB with the same effect on det (AB). 


Hence it remains to prove (10) for AB; written out, 


a4, O 0 bu biz Din 
" 0 dge 0 bo, beg bon 
AB = 
0 0 ann Dri bno Brn 
Q1b11 yi b12 Q1Pin 
Ggob01 — Gaabe0 Azabon 
Gnnbni Gnnbn2 Gnnbun 


We now take the determinant det (AB). On the right we can take out a factor G1 from 
the first row, Go from the second, ---, Gy», from the nth. But this product G1 G92 °-: Gyn 
equals det A because A is diagonal. The remaining determinant is det B. This proves (10) 
for det (AB), and the proof for det (BA) follows by the same idea. fail 


This completes our discussion of linear systems (Secs. 7.3—7.8). Section 7.9 on vector 
spaces and linear transformations is optional. Numeric methods are discussed in Secs. 
20.1-20.4, which are independent of other sections on numerics. 


PROBLEM SET 7-8 


1-10| INVERSE 


Check by using (1). 


0 1 O ! 2 3 
Find the i rse by G —Jord: by (4*) if mn = 2). 
ind the inverse by Gauss—Jordan (or by (4*) if n ) 7110 0 g.|4 5 6 
1.80 —2.32 cos 20 sin 20 [oO 0 1 [7 8 9 
1. 5, " a A 
—0.25 0.60 —sin 20 cos 20 0 8 O 3 3 3 
[0.3 -O.1 05 0 0 0.1 9/0 0 4 mje 2 4 
5\2 6 4 4/0 -04 0 12 0 0 a a 
Is 0 9 125 0 0 
EB 6 0 ad 0 é 11-18 | SOME GENERAL FORMULAS 
11. Inverse of the square. Verify (A?)~1 = (A- h2 for A 
5. | 2 1 0 6. 0 8 13 in Prob. 1. 
[5 4 1 | Oo 3 3 12. Prove the formula in Prob. 11. 
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13. Inverse of the transpose. Verify ( A')~1 = (A74! for 18. Row interchange. Same task as in Prob. 16 for the 


A in Prob. 1. matrix in Prob. 7. 

14. Prove the formula in Prob. 13. 

15. Inverse of the inverse. Prove that (A7/)~1 = A. 19-20 | FORMULA (4) 

16. Rotation. Give an application of the matrix in Prob. 2. Formula (4) is occasionally needed in theory. To understand 
that makes the form of the inverse obvious. it, apply it and check the result by Gauss—Jordan: 

17. Triangular matrix. Is the inverse of a triangular 19. In Prob. 3 
matrix always triangular (as in Prob. 5)? Give reason. 20. In Prob. 6 


7.9 Vector Spaces, Inner Product Spaces, 
Linear Transformations Optional 


DEFINITION 


We have captured the essence of vector spaces in Sec. 7.4. There we dealt with special 
vector spaces that arose quite naturally in the context of matrices and linear systems. The 
elements of these vector spaces, called vectors, satisfied rules (3) and (4) of Sec. 7.1 
(which were similar to those for numbers). These special vector spaces were generated 
by spans, that is, linear combination of finitely many vectors. Furthermore, each such 
vector had n real numbers as components. Review this material before going on. 

We can generalize this idea by taking all vectors with n real numbers as components 
and obtain the very important real n-dimensional vector space R". The vectors are known 
as “real vectors.” Thus, each vector in R” is an ordered n-tuple of real numbers. 

Now we can consider special values for n. For n = 2, we obtain R?, the vector space 
of all ordered pairs, which correspond to the vectors in the plane. For n = 3, we obtain 
R3, the vector space of all ordered triples, which are the vectors in 3-space. These vectors 
have wide applications in mechanics, geometry, and calculus and are basic to the engineer 
and physicist. 

Similarly, if we take all ordered n-tuples of complex numbers as vectors and complex 
numbers as scalars, we obtain the complex vector space C”, which we shall consider in 
Sec. 8.5. 

Furthermore, there are other sets of practical interest consisting of matrices, functions, 
transformations, or others for which addition and scalar multiplication can be defined in 
an almost natural way so that they too form vector spaces. 

It is perhaps not too great an intellectual jump to create, from the concrete model R”, 
the abstract concept of a real vector space V by taking the basic properties (3) and (4) 
in Sec. 7.1 as axioms. In this way, the definition of a real vector space arises. 


Real Vector Space 


A nonempty set V of elements a, b, - - - is called a real vector space (or real linear 
space), and these elements are called vectors (regardless of their nature, which will 
come out from the context or will be left arbitrary) if, in V, there are defined two 
algebraic operations (called vector addition and scalar multiplication) as follows. 

I. Vector addition associates with every pair of vectors a and b of V a unique 
vector of V, called the sum of a and b and denoted by a + b, such that the following 
axioms are satisfied. 
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1.1 Commutativity. For any two vectors a and b of V, 
at+tb=bta. 
I.2 Associativity. For any three vectors a, b, ¢ of V, 
(a+b) +c=at+(bt+oc) (written a + b + c). 


I.3 There is a unique vector in V, called the zero vector and denoted by 0, such 
that for every a in V, 


a+0=a. 


1.4 For every a in V there is a unique vector in V that is denoted by —a and is 
such that 


a+ (—a)=0. 

II. Scalar multiplication. The real numbers are called scalars. Scalar 
multiplication associates with every a in V and every scalar c a unique vector of V, 
called the product of c and a and denoted by ca (or ac) such that the following 
axioms are satisfied. 

Il.1 Distributivity. For every scalar c and vectors a and b in V, 

c(a + b) = cat cb. 
1.2 Distributivity. For all scalars c and k and every a in V, 


(c + ka =ca + ka. 


II.3 Associativity. For all scalars c and k and every a in V, 


c(ka) = (ck)a (written cka). 
11.4 For every a in V, 


la=a. 


If, in the above definition, we take complex numbers as scalars instead of real numbers, 
we obtain the axiomatic definition of a complex vector space. 

Take a look at the axioms in the above definition. Each axiom stands on its own: It 
is concise, useful, and it expresses a simple property of V. There are as few axioms as 
possible and together they express all the desired properties of V. Selecting good axioms 
is a process of trial and error that often extends over a long period of time. But once 
agreed upon, axioms become standard such as the ones in the definition of a real vector 
space. 
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EXAMPLE 1 


EXAMPLE 2 


The following concepts related to a vector space are exactly defined as those given in 
Sec. 7.4. Indeed, a linear combination of vectors aq), °* +, Aq) In a Vector space V is an 
expression 


c1aqy + +++ + Cmam (C1,°°*, Cm any scalars). 


These vectors form a linearly independent set (briefly, they are called linearly 
independent) if 


(1) cyaqy + +++ + Cmaan = 0 


implies that cy = 0,--+, Cm = 0. Otherwise, if (1) also holds with scalars not all zero, the 
vectors are called linearly dependent. 

Note that (1) with m = 1 is ca = 0 and shows that a single vector a is linearly 
independent if and only if a # 0. 

V has dimension 7, or is n-dimensional, if it contains a linearly independent set of n 
vectors, whereas any set of more than n vectors in V is linearly dependent. That set of 
n linearly independent vectors is called a basis for V. Then every vector in V can be 
written as a linear combination of the basis vectors. Furthermore, for a given basis, this 
representation is unique (see Prob. 2). 


Vector Space of Matrices 


The real 2 X 2 matrices form a four-dimensional real vector space. A basis is 


0 1 0 0 0 0 
» By= > Bo = » Bog = 
0 0 1 O 0 1 


because any 2 X 2 matrix A = [aj] has a unique representation A = 41By1 + 412By2 + d21Bo1 + d22Boo. 
Similarly, the real m X n matrices with fixed m and n form an mn-dimensional vector space. What is the 
dimension of the vector space of all 3 X 3 skew-symmetric matrices? Can you find a basis? B 


1 0 
Bu = 


0 0 


Vector Space of Polynomials 


The set of all constant, linear, and quadratic polynomials in x together is a vector space of dimension 3 with 
basis {1, x, x7} under the usual addition and multiplication by real numbers because these two operations give 
polynomials not exceeding degree 2. What is the dimension of the vector space of all polynomials of degree 
not exceeding a given fixed n? Can you find a basis? B 


If a vector space V contains a linearly independent set of n vectors for every n, no matter 
how large, then V is called infinite dimensional, as opposed to a finite dimensional 
(n-dimensional) vector space just defined. An example of an infinite dimensional vector 
space is the space of all continuous functions on some interval [a, b] of the x-axis, as we 
mention without proof. 


Inner Product Spaces 


If a and b are vectors in R”, regarded as column vectors, we can form the product a'b. 
This is a 1 X 1 matrix, which we can identify with its single entry, that is, with a number. 
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This product is called the inner product or dot product of a and b. Other notations for 
it are (a, b) and ae b. Thus 


by 


n 


a'b = (a,b) = aeb= [ay---an]] |= Sab) = ayby + +++ + nbn. 
i=1 
b,| 


We now extend this concept to general real vector spaces by taking basic properties of 
(a, b) as axioms for an “abstract inner product” (a, b) as follows. 


Real Inner Product Space 


A real vector space V is called a real inner product space (or real pre-Hilbert* 
space) if it has the following property. With every pair of vectors a and b in V there 
is associated a real number, which is denoted by (a, b) and is called the inner 
product of a and b, such that the following axioms are satisfied. 


I. For all scalars g, and gz and all vectors a, b, c in V, 
(qia + gob, c) = qy(a, c) + go(b, ce) (Linearity). 
II. For all vectors a and b in V, 
(a, b) = (b, a) (Symmetry). 
III. For every a in V, 


(a,a) 2 0, 
(Positive-definiteness). 
(a,a)=0 ifandonlyif a=0 


Vectors whose inner product is zero are called orthogonal. 
The /ength or norm of a vector in V is defined by 


(2) jal = V(a,a) (= 0). 


A vector of norm | is called a unit vector. 


4DAVID HILBERT (1862-1943), great German mathematician, taught at K6nigsberg and Gottingen and was 
the creator of the famous Gottingen mathematical school. He is known for his basic work in algebra, the calculus 
of variations, integral equations, functional analysis, and mathematical logic. His “Foundations of Geometry” 
helped the axiomatic method to gain general recognition. His famous 23 problems (presented in 1900 at the 
International Congress of Mathematicians in Paris) considerably influenced the development of modern 
mathematics. 

If V is finite dimensional, it is actually a so-called Hilbert space; see [GenRef7], p. 128, listed in App. 1. 
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EXAMPLE 3 


EXAMPLE 4 


From these axioms and from (2) one can derive the basic inequality 
(3) \(a, b)| S|lal] ||bl] © (Cauchy-Schwarz? inequality). 
From this follows 
(4) [a + b|| S |lal] + ||b]} (Triangle inequality). 
A simple direct calculation gives 
(5) ja + b|? + lla - b||? = 2(|\al/? ou || b||?) (Parallelogram equality). 


n-Dimensional Euclidean Space 


R” with the inner product 


(6) (a,b) = a'b = ayby +--+ + dnby 


(where both a and b are column vectors) is called the n-dimensional Euclidean space and is denoted by E” or 
again simply by R”. Axioms III hold, as direct calculation shows. Equation (2) gives the “Euclidean norm” 


(7) all = V(a,a) = Vala = Va? + ++) + a2. H 


An Inner Product for Functions. Function Space 


The set of all real-valued continuous functions f(x), g(x), --:on a given interval a S x S B is a real vector 
space under the usual addition of functions and multiplication by scalars (real numbers). On this “function 
space” we can define an inner product by the integral 


B 
(8) (f, 8) = [ro eco dx. 


a 


Axioms I-III can be verified by direct calculation. Equation (2) gives the norm 


B 
(9) Isl= VED = 4] | reoPae = 


Our examples give a first impression of the great generality of the abstract concepts of 
vector spaces and inner product spaces. Further details belong to more advanced courses 
(on functional analysis, meaning abstract modern analysis; see [GenRef7] listed in App. 
1) and cannot be discussed here. Instead we now take up a related topic where matrices 
play a central role. 


Linear Transformations 


Let X and Y be any vector spaces. To each vector x in X we assign a unique vector y in 
Y. Then we say that a mapping (or transformation or operator) of X into Y is given. 
Such a mapping is denoted by a capital letter, say F. The vector y in Y assigned to a vector 
x in X is called the image of x under F and is denoted by F (x) [or Fx, without parentheses]. 


°HERMANN AMANDUS SCHWARZ (1843-1921). German mathematician, known by his work in complex 
analysis (conformal mapping) and differential geometry. For Cauchy see Sec. 2.5. 
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F is called a linear mapping or linear transformation if, for all vectors v and x in X 
and scalars c, 


(10) F(v + x) = F(v) + F(x) 
F (cx) = cF (x). 


Linear Transformation of Space R” into Space R™ 


From now on we let X = R” and Y = R™. Then any real m X n matrix A = [aj~] gives 
a transformation of R” into R™, 


(11) y = Ax. 


Since A(u + x) = Au + Ax and A(cx) = cAx, this transformation is linear. 

We show that, conversely, every linear transformation F of R” into R”’ can be given 
in terms of an m X n matrix A, after a basis for R” and a basis for R’™ have been chosen. 
This can be proved as follows. 

Let €,1),°**, €q) be any basis for R”. Then every x in R” has a unique representation 


K = X1€(q) ae Xne(n)- 
Since F is linear, this representation implies for the image F(x): 
F(x) = F(x 1eq) ae eta oe Xn&n)) = x F (e€() a XyF (em): 


Hence F is uniquely determined by the images of the vectors of a basis for R”. We now 
choose for R” the “standard basis” 


1 0 0 
0 1 0 
(12) ea =|9, e~=lO, -, ea =] 0 
0 0 1 


where €,;) has its jth component equal to | and all others 0. We show that we can now 
determine an m X n matrix A = [a;;,] such that for every x in R” and image y = F(x) in 
R's 


y = F(x) = Ax. 


dd) _ 


Indeed, from the image y F (eq) of €3) we get the condition 


a 

Yi a1 can {i} 1 
a 

Ys day *** an, | | O 


Ym Am1 apes amm 
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EXAMPLE 5 


EXAMPLE 6 


from which we can determine the first column of A, namely a1, = yi”, dg, = ye, Hay 
ami = yee. Similarly, from the image of eg) we get the second column of A, and so on. 


This completes the proof. B 


We say that A represents F, or is a representation of F, with respect to the bases for R” 
and R”’. Quite generally, the purpose of a “representation” is the replacement of one 
object of study by another object whose properties are more readily apparent. 

In three-dimensional Euclidean space E® the standard basis is usually written ep =i 
ea) = j. ea) = k. Thus, 


1 0 0 
(13) i=/0O}, j=/1], k=|0 
0 0 1 


These are the three unit vectors in the positive directions of the axes of the Cartesian 
coordinate system in space, that is, the usual coordinate system with the same scale of 
measurement on the three mutually perpendicular coordinate axes. 


Linear Transformations 


Interpreted as transformations of Cartesian coordinates in the plane, the matrices 


Dea ale lie al 


represent a reflection in the line x2 = x4, a reflection in the x-axis, a reflection in the origin, and a stretch 
(when a > 1, or a contraction when 0 < a < 1) in the x,-direction, respectively. a] 


Linear Transformations 


Our discussion preceding Example 5 is simpler than it may look at first sight. To see this, find A representing 
the linear transformation that maps (x1, x2) onto (2x, — 5x2, 3x, + 4x2). 


Solution. Obviously, the transformation is 


yr = 2x1 — 5x2 


yo = 3x, + 4x9. 


2 =S xX. 
7 3 4 || x2 7 
If A in (11) is square, n X n, then (11) maps R” into R”. If this A is nonsingular, so that 


A7? exists (see Sec. 7.8), then multiplication of (11) by A~? from the left and use of 
A“'A = I gives the inverse transformation 


From this we can directly see that the matrix is 


2 =35 J 2X4 Sxo 


A= : 


3 + 


Check: | 


y2 3x1 + 4xo 


(14) x=A'ly. 


It maps every y = Yo onto that x, which by (11) is mapped onto yo. The inverse of a linear 
transformation is itself linear, because it is given by a matrix, as (14) shows. 
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Composition of Linear Transformations 


We want to give you a flavor of how linear transformations in general vector spaces work. 
You will notice, if you read carefully, that definitions and verifications (Example 7) strictly 
follow the given rules and you can think your way through the material by going in a 
slow systematic fashion. 

The last operation we want to discuss is composition of linear transformations. Let X, 
Y, W be general vector spaces. As before, let F be a linear transformation from X to Y. 
Let G be a linear transformation from W to X. Then we denote, by H, the composition 
of F and G, that is, 


H=F°G= FG = FG), 


which means we take transformation G and then apply transformation F to it (in that 
order!, i.e. you go from left to right). 

Now, to give this a more concrete meaning, if we let w be a vector in W, then G(w) 
is a vector in X and F(G(w)) is a vector in Y. Thus, H maps W to Y, and we can write 


(15) H(w) = (F ° G) (w) = (FG) (w) = F(G(w)), 


which completes the definition of composition in a general vector space setting. But is 
composition really linear? To check this we have to verify that H, as defined in (15), obeys 
the two equations of (10). 


The Composition of Linear Transformations Is Linear 
To show that H is indeed linear we must show that (10) holds. We have, for two vectors wy, Wo in W, 
H(wy + Wo) = (F° G)(wy + Wo) 


= F(G(w1 + we)) 


= F(G(w) + G(wa)) (by linearity of G) 
= F(G(wy)) + F(G(wo)) (by linearity of F) 
= (F ° G)(wy) + (Fe G)(wa) (by (15)) 

= H(w1) + H(Wo) (by definition of H). 


Similarly, H(cwe) = (F ° G)(cW2) = F(G(cW2)) = F(c(G(we)) 
= cF(G(We)) = c(F ° G)(we) = cH(Wwa). a 


We defined composition as a linear transformation in a general vector space setting and 
showed that the composition of linear transformations is indeed linear. 

Next we want to relate composition of linear transformations to matrix multiplication. 

To do so we let X = R", Y = R"™, and W = R?. This choice of particular vector spaces 
allows us to represent the linear transformations as matrices and form matrix equations, 
as was done in (11). Thus F can be represented by a general real m X n matrix A = [ajx| 
and G by ann X p matrix B = [bj]. Then we can write for F, with column vectors x 
with n entries, and resulting vector y, with m entries 


(16) y = Ax 
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and similarly for G, with column vector w with p entries, 

(17) x = Bw. 

Substituting (17) into (16) gives 

(18) y = Ax = A(Bw) = (AB)w = ABw = Cw where C = AB. 


This is (15) in a matrix setting, this is, we can define the composition of linear transfor- 
mations in the Euclidean spaces as multiplication by matrices. Hence, the real m X p 
matrix C represents a linear transformation H which maps R? to R” with vector w, a 
column vector with p entries. 


Remarks. Our discussion is similar to the one in Sec. 7.2, where we motivated the 
“unnatural” matrix multiplication of matrices. Look back and see that our current, more 
general, discussion is written out there for the case of dimension m = 2,n = 2, andp = 2. 
(You may want to write out our development by picking small distinct dimensions, such 
as m = 2,n = 3, and p = 4, and writing down the matrices and vectors. This is a trick 
of the trade of mathematicians in that we like to develop and test theories on smaller 
examples to see that they work.) 


Linear Transformations. Composition 


In Example 5 of Sec. 7.9, let A be the first matrix and B be the fourth matrix with a > 1. Then, applying B to 
a vector w = [wy we]!, stretches the element w, by a in the x, direction. Next, when we apply A to the 
“stretched” vector, we reflect the vector along the line x; = xg, resulting in a vector y = [We ay)". But this 
represents, precisely, a geometric description for the composition H of two linear transformations F and G 
represented by matrices A and B. We now show that, for this example, our result can be obtained by 
straightforward matrix multiplication, that is, 


and as in (18) calculate 


ABw = 


which is the same as before. This shows that indeed AB = C, and we see the composition of linear 
transformations can be represented by a linear transformation. It also shows that the order of matrix multiplication 
is important (!). You may want to try applying A first and then B, resulting in BA. What do you see? Does it 
make geometric sense? Is it the same result as AB? B 


We have learned several abstract concepts such as vector space, inner product space, 
and linear transformation. The introduction of such concepts allows engineers and 
scientists to communicate in a concise and common language. For example, the concept 
of a vector space encapsulated a lot of ideas in a very concise manner. For the student, 
learning such concepts provides a foundation for more advanced studies in engineering. 

This concludes Chapter 7. The central theme was the Gaussian elimination of Sec. 7.3 
from which most of the other concepts and theory flowed. The next chapter again has a 
central theme, that is, eigenvalue problems, an area very rich in applications such as in 
engineering, modern physics, and other areas. 
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PROBLEM SET 7-9 


1. Basis. Find three bases of R?. 


2. Uniqueness. Show that the representation v = ca) 
+ +++ + Cpa) of any given vector in an n-dimensional 
vector space V in terms of a given basis a4, °**, Ac) 
for V is unique. Hint. Take two representations and 
consider the difference. 


3-10} VECTOR SPACE 


(More problems in Problem Set 9.4.) Is the given set, taken 
with the usual addition and scalar multiplication, a vector 
space? Give reason. If your answer is yes, find the dimen- 
sion and a basis. 


3. All vectors in R® satisfying —v, + 2v2 + 3v3 = 0, 
4v1 + Vg 4 V3 = 0. 
4. All skew-symmetric 3 X 3 matrices. 


5. All polynomials in x of degree 4 or less with 
nonnegative coefficients. 


6. All functions y(x) = acos 2x + b sin 2x with arbitrary 
constants a and b. 


7. All functions y(x) = (ax + b)e~* with any constant a 
and b. 


8. All n X n matrices A with fixed n and det A = 0. 
9. All 2 x 2 matrices [aj,] with a1 + agg = 0. 


10. All3 X 2 matrices [aj] with first column any multiple 
of [3 0 —5]". 


11-14} LINEAR TRANSFORMATIONS 


Find the inverse transformation. Show the details. 
11. yy = 0.5x1 — 0.5x2 12. yy = 3x1 + 2x9 


yo = 1.5x4 _ 2.5x9 yo = Ax, Tr Xe 


13. 


14. 


y= 5x41 he 3x9 = 3x3 


yo = 3x4 aig 2x9 = 2x3 


yg = 2x1 - Xo + 2x3 


yi 0.2x4 _ O.1x5 
y2 = _ 0.2x9 Tr 0.1x3 
Jy3 = O.1x4 ar 0.1x3 


15-20} EUCLIDEAN NORM 


Find the Euclidean norm of the vectors: 


15 


[3 1 -4]' 16.[5 3 —2 —3]' 
(1 0 0 1-1 0 -1 JT 
[-4 8 -1]" 19. [3 3 3 Of 
[5 > eee a)" 

2 2 2 2 


21-25| INNER PRODUCT. ORTHOGONALITY 


21. 


22. 


23. 


24. 


25. 


Orthogonality. For what value(s) of k are the vectors 
[2 § —4 Oj’ and[5 k 0 4]" orthogonal? 


Orthogonality. Find all vectors in R? orthogonal to 
[2 0 1]. Do they form a vector space? 


Triangle inequality. Verify (4) for the vectors in 
Probs. 15 and 18. 


Cauchy-Schwarz inequality. Verify (3) for the 
vectors in Probs. 16 and 19. 


Parallelogram equality. Verify (5) for the first two 
column vectors of the coefficient matrix in Prob. 13. 


CHAPTER 7 REVIEW QUESTIONS AND PROBLEMS 


1. What properties of matrix multiplication differ from 
those of the multiplication of numbers? 

2. Let A bea 100 X 100 matrix and Ba 100 X 50 matrix. 
Are the following expressions defined or not? A + B, 
A’, B’, AB, BA, AA‘, B'A, B'B, BB’, B'AB. Give 
reasons. 

3. Are there any linear systems without solutions? With 
one solution? With more than one solution? Give 
simple examples. 

4. Let C be 10 X 10 matrix and a a column vector with 
10 components. Are the following expressions defined 
or not? Ca, Ca, Ca’, aC, aC, (Ca‘)". 


5. 


Motivate the definition of matrix multiplication. 


6. Explain the use of matrices in linear transformations. 


10. 


. How can you give the rank of a matrix in terms of row 


vectors? Of column vectors? Of determinants? 


. What is the role of rank in connection with solving 


linear systems? 


. What is the idea of Gauss elimination and back 


substitution? 


What is the inverse of a matrix? When does it exist? 
How would you determine it? 


Chapter 7 Review Questions and Problems 


give reason why they are not defined, when 


3 1 -3 0 
A=| 1 4.) 2], B=|-4 
[3 2 5 | -1 
2 | 9 
u= O}, v=|-3 
[+5 | 3 
11. AB, BA 12. A’, BT’ 
13. Au, wA 14. u'v, uv! 
15. u'Au, v'By 16. A+, Bt 


17. detA, detA?, (det A)*, detB 
18. (A2)-1, (A742 19. AB— BA 
20. (A + A')(B — B') 


21-28| LINEAR SYSTEMS 


Showing the details, find all solutions or indicate that no 


solution exists. 
21. 4y+ z=0 


12x — 5y — 3z = 34 


=6x + 4z2= 
22. 5x —3y + z=7 
2x + 3y- z=0 
8x + Gy — 3z = 2 
23. 9x + 3y — 6z = 60 


Ix — 4y + 8z= 4 
24. —6x + 39y — 9z = —-12 


2x- 13y+3z= 4 
25. 0.3x — 0.7y + 1.3z = 3.24 
0.9y — 0.8z = —2.53 
0.7z = 1.19 


26. 2x+3y- 72=3 


—4x — 6y + 142 =7 


11-20} MATRIX AND VECTOR CALCULATIONS 


Showing the details, calculate the following expressions or 
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27 x+2y= 6 
3x + 5y= 20 
—4x+ y= -42 

28. —8x +2z= 

6y + 4z = 3 
12x + 2y = 
29-32} RANK 


Determine the ranks of the coefficient matrix and the 
augmented matrix and state how many solutions the linear 
system will have. 


29. In Prob. 23 
30. In Prob. 24 
31. In Prob. 27 
32. In Prob. 26 


33-35 | NETWORKS 


Find the currents. 


33. 202 


34. 220V 
—_ 
5a 1, 
I; 
i. 109 


35. 


20 Q 
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SUMMARY—OF CHAPTER TL 
Linear Algebra: Matrices, Vectors, Determinants. 


Linear Systems 


An m Xn matrix A = [aj] is a rectangular array of numbers or functions 
(“entries,” “elements”) arranged in m horizontal rows and n vertical columns. If 
m =n, the matrix is called square. A | X n matrix is called a row vector and an 
m X 1 matrix a column vector (Sec. 7.1). 

The sum A + B of matrices of the same size (i.e., both m X n) is obtained by 
adding corresponding entries. The product of A by a scalar c is obtained by 
multiplying each aj, by c (Sec. 7.1). 

The product C = AB of an m X n matrix A by an r X p matrix B = [bjx] is 
defined only when r = n, and is the m X p matrix C = [cj] with entries 


(1) biz + djoboy + +++ + ajnb (row j of A times 
Cjik = a; a; oa a; 
gk GlP1k HW QU2K Hine nk Poianeat B). 


This multiplication is motivated by the composition of linear transformations 
(Secs. 7.2, 7.9). It is associative, but is not commutative: if AB is defined, BA may 
not be defined, but even if BA is defined, AB # BA in general. Also AB = 0 may 
not imply A = 0 or B = 0 or BA = 0 (Secs. 7.2, 7.8). Illustrations: 


(1 21] | =), [1 2]= 
4 4 4 8 


The transpose A’ of a matrix A = [a;,] is AT = [a,j]; rows become columns 
and conversely (Sec. 7.2). Here, A need not be square. If it is and A = A', then A 
is called symmetric; if A = —A’, it is called skew-symmetric. For a product, 
(AB)' = B'AT (Sec. 7.2). 

A main application of matrices concerns linear systems of equations 


(2) Ax =b (Sec. 7.3) 


(m equations inn unknowns x4, ---,X,; A and b given). The most important method 
of solution is the Gauss elimination (Sec. 7.3), which reduces the system to 
“triangular” form by elementary row operations, which leave the set of solutions 
unchanged. (Numeric aspects and variants, such as Doolittle’s and Cholesky’s 
methods, are discussed in Secs. 20.1 and 20.2.) 


Summary of Chapter 7 
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Cramer’s rule (Secs. 7.6, 7.7) represents the unknowns in a system (2) of n 
equations in n unknowns as quotients of determinants; for numeric work it is 
impractical. Determinants (Sec. 7.7) have decreased in importance, but will retain 
their place in eigenvalue problems, elementary geometry, etc. 

The inverse A~* of a square matrix satisfies AA~' = A7‘A = I. It exists if and 
only if det A # 0. It can be computed by the Gauss—Jordan elimination (Sec. 7.8). 

The rank r of a matrix A is the maximum number of linearly independent rows 
or columns of A or, equivalently, the number of rows of the largest square submatrix 
of A with nonzero determinant (Secs. 7.4, 7.7). 

The system (2) has solutions if and only if rank A = rank [Ab], where [Ab] 
is the augmented matrix (Fundamental Theorem, Sec. 7.5). 

The homogeneous system 


(3) Ax = 0 


has solutions x # 0 (“nontrivial solutions”) if and only if rank A < n, in the case 
m = n equivalently if and only if det A = 0 (Secs. 7.6, 7.7). 


Vector spaces, inner product spaces, and linear transformations are discussed in 
Sec. 7.9. See also Sec. 7.4. 
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CHAPTER 8 


Linear Algebra: 
Matrix Eigenvalue Problems 


A matrix eigenvalue problem considers the vector equation 
(1) Ax = Ax. 


Here A is a given square matrix, A an unknown scalar, and x an unknown vector. In a 
matrix eigenvalue problem, the task is to determine A’s and x’s that satisfy (1). Since 
x = 0 is always a solution for any A and thus not interesting, we only admit solutions 
with x # 0. 

The solutions to (1) are given the following names: The A’s that satisfy (1) are called 
eigenvalues of A and the corresponding nonzero x’s that also satisfy (1) are called 
eigenvectors of A. 

From this rather innocent looking vector equation flows an amazing amount of relevant 
theory and an incredible richness of applications. Indeed, eigenvalue problems come up 
all the time in engineering, physics, geometry, numerics, theoretical mathematics, biology, 
environmental science, urban planning, economics, psychology, and other areas. Thus, in 
your career you are likely to encounter eigenvalue problems. 

We start with a basic and thorough introduction to eigenvalue problems in Sec. 8.1 and 
explain (1) with several simple matrices. This is followed by a section devoted entirely 
to applications ranging from mass-—spring systems of physics to population control models 
of environmental science. We show you these diverse examples to train your skills in 
modeling and solving eigenvalue problems. Eigenvalue problems for real symmetric, 
skew-symmetric, and orthogonal matrices are discussed in Sec. 8.3 and their complex 
counterparts (which are important in modern physics) in Sec. 8.5. In Sec. 8.4 we show 
how by diagonalizing a matrix, we obtain its eigenvalues. 


COMMENT. Numerics for eigenvalues (Secs. 20.6-20.9) can be studied immediately 
after this chapter. 


Prerequisite: Chap. 7. 
Sections that may be omitted in a shorter course: 8.4, 8.5. 
References and Answers to Problems: App. | Part B, App. 2. 
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The following chart identifies where different types of eigenvalue problems appear in the 
book. 


Topic Where to find it 
Matrix Eigenvalue Problem (algebraic eigenvalue problem) Chap. 8 
Eigenvalue Problems in Numerics Secs. 20.6—20.9 
Eigenvalue Problem for ODEs (Sturm—Liouville problems) Secs. 11.5, 11.6 
Eigenvalue Problems for Systems of ODEs Chap. 4 
Eigenvalue Problems for PDEs Secs. 12.3-12.11 


8.1 The Matrix Eigenvalue Problem. Determining 


Eigenvalues and Eigenvectors 


Consider multiplying nonzero vectors by a given square matrix, such as 


6 31/5 33 6 31/3 30 

4 7\l1] |27) [4 7]L4] [40 
We want to see what influence the multiplication of the given matrix has on the vectors. 
In the first case, we get a totally new vector with a different direction and different length 
when compared to the original vector. This is what usually happens and is of no interest 
here. In the second case something interesting happens. The multiplication produces a 
vector [30 4o]" = 10[3 Al", which means the new vector has the same direction as 
the original vector. The scale constant, which we denote by A is 10. The problem of 
systematically finding such X’s and nonzero vectors for a given square matrix will be the 
theme of this chapter. It is called the matrix eigenvalue problem or, more commonly, the 
eigenvalue problem. 


We formalize our observation. Let A = [aj,] be a given nonzero square matrix of 
dimension n X n. Consider the following vector equation: 


(1) Ax = dx. 


The problem of finding nonzero x’s and A’s that satisfy equation (1) is called an eigenvalue 
problem. 


Remark. So A is a given square (!) matrix, x is an unknown vector, and A is an 
unknown scalar. Our task is to find A’s and nonzero x’s that satisfy (1). Geometrically, 
we are looking for vectors, x, for which the multiplication by A has the same effect as 
the multiplication by a scalar A; in other words, Ax should be proportional to x. Thus, 
the multiplication has the effect of producing, from the original vector x, a new vector 
Ax that has the same or opposite (minus sign) direction as the original vector. (This was 
all demonstrated in our intuitive opening example. Can you see that the second equation in 
that example satisfies (1) with A = 10 and x = [3 Ay", and A the given 2 X 2 matrix? 
Write it out.) Now why do we require x to be nonzero? The reason is that x = 0 is 
always a solution of (1) for any value of A, because AO = 0. This is of no interest. 
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We introduce more terminology. A value of A, for which (1) has a solution x # 0, is 
called an eigenvalue or characteristic value of the matrix A. Another term for A is a latent 
root. (“Eigen” is German and means “proper” or “characteristic.”). The corresponding 
solutions x # 0 of (1) are called the eigenvectors or characteristic vectors of A 
corresponding to that eigenvalue A. The set of all the eigenvalues of A is called the 
spectrum of A. We shall see that the spectrum consists of at least one eigenvalue and at 
most of n numerically different eigenvalues. The largest of the absolute values of the 
eigenvalues of A is called the spectral radius of A, a name to be motivated later. 


How to Find Eigenvalues and Eigenvectors 


Now, with the new terminology for (1), we can just say that the problem of determining 
the eigenvalues and eigenvectors of a matrix is called an eigenvalue problem. (However, 
more precisely, we are considering an algebraic eigenvalue problem, as opposed to an 
eigenvalue problem involving an ODE or PDE, as considered in Secs. 11.5 and 12.3, or 
an integral equation.) 

Eigenvalues have a very large number of applications in diverse fields such as in 
engineering, geometry, physics, mathematics, biology, environmental science, economics, 
psychology, and other areas. You will encounter applications for elastic membranes, 
Markov processes, population models, and others in this chapter. 

Since, from the viewpoint of engineering applications, eigenvalue problems are the most 
important problems in connection with matrices, the student should carefully follow our 
discussion. 

Example 1 demonstrates how to systematically solve a simple eigenvalue problem. 


Determination of Eigenvalues and Eigenvectors 


We illustrate all the steps in terms of the matrix 


—5 2 
A= : 
2 =2 
Solution. (a) Eigenvalues. These must be determined first. Equation (1) is 
=) 21) X41 Xy —5x1, + 2x9 = Axy 
Ax = =X : in components, 
2 =2 x2 x2 2x4 _ 2x92 = Axo. 
Transferring the terms on the right to the left, we get 
(=5 — A)x1 oe 2x9 =0 
@*) 
2x1 - (=2 = A)xo = 0. 
This can be written in matrix notation 
(3*) (A — ADx = 0 


because (1) is Ax — Ax = Ax — AIx = (A — ADx = 0, which gives (3*). We see that this is a homogeneous 
linear system. By Cramer’s theorem in Sec. 7.7 it has a nontrivial solution x # 0 (an eigenvector of A we are 
looking for) if and only if its coefficient determinant is zero, that is, 


=o: =A 2 


(4*) D(A) = det(A — AD = 


(-5 — Al(-2 -— A) -—4 = N+ 714+6=0. 
2 =2= 4 
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We call D(A) the characteristic determinant or, if expanded, the characteristic polynomial, and D(A) = 0 


the characteristic equation of A. The solutions of this quadratic equation are Ay = —1 and Ag = —6. These 
are the eigenvalues of A. 
(b,) Eigenvector of A corresponding to \,. This vector is obtained from (2*) with A = Ay = —1, that is, 
—4x1 + 2x2 =0 
2x4 — xXx2= 0. 


A solution is x2 = 2x , as we see from either of the two equations, so that we need only one of them. This 
determines an eigenvector corresponding to A; = —1 up to a scalar multiple. If we choose x; = 1, we obtain 
the eigenvector 


1 —5 2/1 -1 
x, = , Check: Ax, = = = (—1)xy = A1X}. 
2 2 —2)|2 —2 


(b2) Eigenvector of A corresponding to Ag. For A = Ay = —6, equation (2*) becomes 


x17 2x9 = 0 


2x1 + 4xq = 0. 


A solution is x2 = —x 4/2 with arbitrary x1. If we choose x, = 2, we get x2 = —1. Thus an eigenvector of A 
corresponding to Ap = —6 is 


2 =) 2 2 =12 
Xo = F Check: Ax = (—6)x2 = Aoxe. 
= 2 -=2||<1 6 


For the matrix in the intuitive opening example at the start of Sec. 8.1, the characteristic equation is 
A? — 13A + 30 = (A — 10)(A — 3) = 0. The eigenvalues are {10, 3}. Corresponding eigenvectors are 
(3. 4)" and[-1 1]', respectively. The reader may want to verify this. a] 


This example illustrates the general case as follows. Equation (1) written in components is 


44X14 + ++ + AyynXy = AXy 
dgyXy + +++ + donXyn = AXQ 
AnyXy + ++ + AnnXn = AXn. 


Transferring the terms on the right side to the left side, we have 


(44. — AMX + yaXQ Ft FH A NnXy = — = O 
ag4X 1 + (doo = A)x9 tosee Ht aanxXn = 0 

(2) 
An1X1 + An2X2 + + (dyn — A)Xn = 0 


In matrix notation, 


(3) (A — ADx = 0. 
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THEOREM 2 
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By Cramer’s theorem in Sec. 7.7, this homogeneous linear system of equations has a 
nontrivial solution if and only if the corresponding determinant of the coefficients is zero: 


a41—A a2 zee An 
21 dag — A+ dan 
(4) D(A) = det(A — AD = = 0, 
ani aAn2 nites ann — X 


A — ALis called the characteristic matrix and D(A) the characteristic determinant of 

A. Equation (4) is called the characteristic equation of A. By developing D(A) we obtain 

a polynomial of nth degree in A. This is called the characteristic polynomial of A. 
This proves the following important theorem. 


Eigenvalues 
The eigenvalues of a square matrix A are the roots of the characteristic equation 
(4) of A. 

Hence ann X n matrix has at least one eigenvalue and at most n numerically 
different eigenvalues. 


For larger n, the actual computation of eigenvalues will, in general, require the use 
of Newton’s method (Sec. 19.2) or another numeric approximation method in Secs. 
20.7-20.9. 

The eigenvalues must be determined first. Once these are known, corresponding 
eigenvectors are obtained from the system (2), for instance, by the Gauss elimination, 
where A is the eigenvalue for which an eigenvector is wanted. This is what we did in 
Example | and shall do again in the examples below. (To prevent misunderstandings: 
numeric approximation methods, such as in Sec. 20.8, may determine eigenvectors first.) 

Eigenvectors have the following properties. 


Eigenvectors, Eigenspace 


If w and x are eigenvectors of a matrix A corresponding to the same eigenvalue i, 
so are W + x (provided x # —w) and kx for any k # 0. 

Hence the eigenvectors corresponding to one and the same eigenvalue of A, 
together with 0, form a vector space (cf. Sec. 7.4), called the eigenspace of A 
corresponding to that X. 


Aw = Aw and Ax = Ax imply A(w + x) = Aw + Ax = Aw + Ax = A(w + x) and 
A (kw) = k(Aw) = k(Aw) = A(kw); hence A(kw + €x) = A(kw + €x). | 


In particular, an eigenvector x is determined only up to a constant factor. Hence we 
can normalize x, that is, multiply it by a scalar to get a unit vector (see Sec. 7.9). For 
instance, x} = [1 2]' in Example 1 has the length ||x,|| = V1? + 2? = V5; hence 
[1/ V5 2/ V5)" is a normalized eigenvector (a unit eigenvector). 
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Examples 2 and 3 will illustrate that ann X n matrix may have n linearly independent 
eigenvectors, or it may have fewer than n. In Example 4 we shall see that a real matrix 
may have complex eigenvalues and eigenvectors. 


EXAMPLE 2 Multiple Eigenvalues 


Find the eigenvalues and eigenvectors of 


A=]|/ 2 1 -6 
i) 2 0 


Solution. For our matrix, the characteristic determinant gives the characteristic equation 


A? — 4? + 214 + 45 = 0. 


The roots (eigenvalues of A) are Ay = 5, Ag = Ag = —3. (If you have trouble finding roots, you may want to 
use a root finding algorithm such as Newton’s method (Sec. 19.2). Your CAS or scientific calculator can find 
roots. However, to really learn and remember this material, you have to do some exercises with paper and pencil.) 
To find eigenvectors, we apply the Gauss elimination (Sec. 7.3) to the system (A — AI)x = 0, first with A = 5 


and then with A = —3. For A = 5 the characteristic matrix is 
[-7 2 -3] | -7 2 -3|] 
A-AI=A-—S5I=| 2 4 6 |. It row-reduces to 0 28 -# 
peal -=2- =3, 0 0 0 | 
Hence it has rank 2. Choosing x3 = —1 we have xy = 2 from Ay. Br = 0 and then x; = | from 
—7Tx 1 + 2x29 — 3x3 = 0. Hence an eigenvector of A corresponding to A = Sis x, = [1 2 1". 
For A = —3 the characteristic matrix 
. 1 2 -3] [1 2 -3] 
A-AIL=A+3I=| 2 4 —-6 row-reduces to 0 0 0 
[-1 -2 3 | [0 0 0| 
Hence it has rank 1. From x1 + 2x9 — 3x3 = 0 we have x1 = —2xg + 3x3. Choosing x2 = 1,x3 = 0 and 
x2 = 0,x3 = 1, we obtain two linearly independent eigenvectors of A corresponding to A = —3 [as they must 
exist by (5), Sec. 7.5, with rank = | and n = 3], 
4 
Xp = 1 
0 
and 
Fa] 
x3 = 0}. |_| 


The order M) of an eigenvalue A as a root of the characteristic polynomial is called the 
algebraic multiplicity of A. The number m), of linearly independent eigenvectors 
corresponding to A is called the geometric multiplicity of A. Thus m, is the dimension 
of the eigenspace corresponding to this A. 
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PROOF 


CHAP. 8 Linear Algebra: Matrix Eigenvalue Problems 


Since the characteristic polynomial has degree n, the sum of all the algebraic 
multiplicities must equal n. In Example 2 for A = —3 we have m, = M) = 2. In general, 
my = M,, as can be shown. The difference A, = M, — m, is called the defect of A. 
Thus A_3 = 0 in Example 2, but positive defects A, can easily occur: 


Algebraic Multiplicity, Geometric Multiplicity. Positive Defect 


The characteristic equation of the matrix 


0 1 —A 1 


=)? =0. 


| is det (A — AD = | 


0 0 0 —-A 


Hence A = 0 is an eigenvalue of algebraic multiplicity Mp = 2. But its geometric multiplicity is only mop = 1, 
since eigenvectors result from —0x ; + x2 = 0, hence x = 0, in the form [x, 0]. Hence for A = 0 the defect 


is Ao =1. 
Similarly, the characteristic equation of the matrix 


3 22 3-A 2 


ia 
ll 


| is det (A — AD = | 


0 3 0 3—X 


Hence A = 3 is an eigenvalue of algebraic multiplicity M3 = 2, but its geometric multiplicity is only m3 = 1, 
since eigenvectors result from 0x, + 2x2 = 0 in the form [x, oy". a 


Real Matrices with Complex Eigenvalues and Eigenvectors 


Since real polynomials may have complex roots (which then occur in conjugate pairs), a real matrix may have 
complex eigenvalues and eigenvectors. For instance, the characteristic equation of the skew-symmetric matrix 


0 1 —r 1 
A= is det(A—AD = =’+1=0 
-1 0 —-l —-A 
It gives the eigenvalues Ay = i(= V—1), Ag i. Eigenvectors are obtained from —ix, + x2 = 0 and 


ix, + xg = 0, respectively, and we can choose x; = | to get 


In the next section we shall need the following simple theorem. 


Eigenvalues of the Transpose 


The transpose A" of a square matrix A has the same eigenvalues as A. 


Transposition does not change the value of the characteristic determinant, as follows from 
Theorem 2d in Sec. 7.7. a 


Having gained a first impression of matrix eigenvalue problems, we shall illustrate their 
importance with some typical applications in Sec. 8.2. 
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1-16] EIGENVALUES, EIGENVECTORS 


Find the eigenvalues. Find the corresponding eigenvectors. 
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PROBLEM SET 8-1 


Use the given A or factor in Probs. 11 and 15. 


15. , A+) 
4 


16. 
2 4A =] 2 
| 0 2 =2 3 
17-20| LINEAR TRANSFORMATIONS 


AND EIGENVALUES 
Find the matrix A in the linear transformation y = Ax, 
where x = [x1 xo]! (x = [x1 Xe x3") are Cartesian 
coordinates. Find the eigenvalues and eigenvectors and 


30 0 0 
1. 2, 
10 —0.6 [0 | 
[5 -2 1 ] 
%, 4. 
\9 -6 [2 | 
| oe 3 1 ] 
5. 6. 
|-3 0 [0 | 
lo 1 l a »b 
7. 8. 
[0 0 [-b a 
| 0.8 —0.6 | cos 6 —sin#é 
9 10 
| 0.6 0.8 | sin 6 cos 6 
[6 2 23 
1./ 2 5 Of, Aa=3 
1-2 0 7 
fs. § & fie 4 3 
2.1/0 4 6 1s oF =8 
aa: ae 15 4 #7 
[2 @ =1 
14/0 43 O 
I} Oo 4 


explain their geometric meaning. 


17. 


18. 
19. 


Counterclockwise rotation through the angle 77/2 about 
the origin in R?. 

Reflection about the x-axis in R?. 

Orthogonal projection (perpendicular projection) of R? 
onto the x9-axis. 


20. Orthogonal projection of R? onto the plane x2 = xj. 
21-25| GENERAL PROBLEMS 
21. Nonzero defect. Find further 2 x 2 and 3 x 3 


22. 


23. 


24. 


25. 


matrices with positive defect. See Example 3. 
Multiple eigenvalues. Find further 2 x 2 and 3 X 3 
matrices with multiple eigenvalues. See Example 2. 
Complex eigenvalues. Show that the eigenvalues of a 
real matrix are real or complex conjugate in pairs. 
Inverse matrix. Show that A”? exists if and only if 
the eigenvalues Aj,---, A, are all nonzero, and then 
At has the eigenvalues 1/Ay,--+, 1/An. 

Transpose. Illustrate Theorem 3 with examples of your 
own. 


8.2 Some Applications of Eigenvalue Problems 


We have selected some typical examples from the wide range of applications of matrix 
eigenvalue problems. The last example, that is, Example 4, shows an application involving 
vibrating springs and ODEs. It falls into the domain of Chapter 4, which covers matrix 
eigenvalue problems related to ODE’s modeling mechanical systems and electrical 
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networks. Example 4 is included to keep our discussion independent of Chapter 4. 
(However, the reader not interested in ODEs may want to skip Example 4 without loss 
of continuity.) 


Stretching of an Elastic Membrane 


An elastic membrane in the x 1x 2-plane with boundary circle xe + x2 =1 (Fig. 160) is stretched so that a point 
P: (x1, Xg) goes over into the point Q: (y1, yg) given by 


V1 5 3]) x41 yy = 5x1 + 3x9 
(1) y= 3 in components, 
3 5|\| x2 yo = 3x, + 5x2. 


Find the principal directions, that is, the directions of the position vector x of P for which the direction of the 
position vector y of Q is the same or exactly opposite. What shape does the boundary circle take under this 
deformation? 


Solution. We are looking for vectors x such that y = Ax. Since y = Ax, this gives Ax = Ax, the equation 
of an eigenvalue problem. In components, Ax = Ax is 


xy a0 3x9 = Ax, (5 = A)xy + 3x9 =0 
(2) or 
3x1 + S5xq = Axo 3xy +(5-—A)xo = 0. 


The characteristic equation is 


5-d 3 
(3) 


=(5-A’-9=0. 
3 5-A 


Its solutions are Ay = 8 and Ag = 2. These are the eigenvalues of our problem. For A = A, = 8, our system (2) 
becomes 


—3x, + 3xgq = 0, | Solution x2 = x1, x1 arbitrary, 


3xy — 3xq = 0. for instance, x1 = x2 = 1. 


For Ag = 2, our system (2) becomes 


3x1 + 3xq = 0, Solution x5 = —x4, x, arbitrary, 
3xy + 3xg = 0. for instance, x; = 1, x2 = —1. 
We thus obtain as eigenvectors of A, for instance, [1 Wy corresponding to Ay and[l  — Wy corresponding to 


Ag (or a nonzero scalar multiple of these). These vectors make 45° and 135° angles with the positive x,-direction. 
They give the principal directions, the answer to our problem. The eigenvalues show that in the principal 
directions the membrane is stretched by factors 8 and 2, respectively; see Fig. 160. 

Accordingly, if we choose the principal directions as directions of a new Cartesian u1u2-coordinate system, 
say, with the positive u4-semi-axis in the first quadrant and the positive w2-semi-axis in the second quadrant of 
the x 1x9-system, and if we set uy = rcos d, ug = rsin d, then a boundary point of the unstretched circular 
membrane has coordinates cos , sin @. Hence, after the stretch we have 


Z1 = 8cos ¢, Zg = 2sing. 


Since cos” ht sin? ¢ = 1, this shows that the deformed boundary is an ellipse (Fig. 160) 


(4) —+—=1 a 
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EXAMPLE 2 


EXAMPLE 3 


Fig. 160. Undeformed and deformed membrane in Example 1 


Eigenvalue Problems Arising from Markov Processes 


Markov processes as considered in Example 13 of Sec. 7.2 lead to eigenvalue problems if we ask for the limit 
state of the process in which the state vector x is reproduced under the multiplication by the stochastic matrix 
A governing the process, that is, Ax = x. Hence A should have the eigenvalue 1, and x should be a corresponding 
eigenvector. This is of practical interest because it shows the long-term tendency of the development modeled 
by the process. 

In that example, 


[0.7 ol 0 [0.7 02 o1ffi] [1 
A =| 0.2 0.9 0.2 For the transpose, 0.1 0.9 0 1|}=}1 
| 0.1 0 0.8 | | 0 0.2 0.8} | 1 1 


Hence AT has the eigenvalue 1, and the same is true for A by Theorem 3 in Sec. 8.1. An eigenvector x of A 
for A = | is obtained from 


[03 oO1 0 -i «6 
A-I=/ 02 —0.1 0,2: ||. row-reduced to 0 —5 3 
0.1 0 -02| | 0 o of 
Taking x3 = 1, we get x2 = 6 from —x2/30 + x3/5 = 0 and then x, = 2 from —3x 4/10 + x2/10 = 0. This 


givesx =[2 6 1)". It means that in the long run, the ratio Commercial:Industrial:Residential will approach 
2:6:1, provided that the probabilities given by A remain (about) the same. (We switched to ordinary fractions 
to avoid rounding errors.) B 


Eigenvalue Problems Arising from Population Models. Leslie Model 


The Leslie model describes age-specified population growth, as follows. Let the oldest age attained by the 
females in some animal population be 9 years. Divide the population into three age classes of 3 years each. Let 
the “Leslie matrix” be 


0 23 04 
(5) L=[lxl=|0.6 0 0 
10 8603 0 | 


where /,, is the average number of daughters born to a single female during the time she is in age class k, and 
lj, ;-1U = 2, 3) is the fraction of females in age class j — | that will survive and pass into class j. (a) What is the 
number of females in each class after 3, 6, 9 years if each class initially consists of 400 females? (b) For what initial 
distribution will the number of females in each class change by the same proportion? What is this rate of change? 
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Solution. (a) Initially, x{y) = [400 400 400]. After 3 years, 


[0 2.3 04]{400] [1080] 
Xg) = Lx@ =|06 0 0 |} 400]=| 240}. 
| 0 0.3 0 |) 400 120 


Similarly, after 6 years the number of females in each class is given by x(6) = (Lx,3))" = [600 648 72], and 
after 9 years we have x(9) = (Lxi6))" = [1519.2 360 194.4]. 

(b) Proportional change means that we are looking for a distribution vector x such that Lx = Ax, where A is 
the rate of change (growth if A > 1, decrease if A < 1). The characteristic equation is (develop the characteristic 
determinant by the first column) 


det (L — AD d? — 0.6(—2.3A — 0.3 + 0.4) A? + 1.38A + 0.072 = 0. 


A positive root is found to be (for instance, by Newton’s method, Sec. 19.2) A = 1.2. A corresponding eigenvector 
x can be determined from the characteristic matrix 


/-12 23 «©04| fi | 
A -— 121 = 06 —12 0 |, say, x=]| 0.5 
0 03-12] 0.125 


where x3 = 0.125 is chosen, xg = 0.5 then follows from 0.3xg — 1.2x3 =0, and x, =1 from 

1.2x, + 2.3x9 + 0.4x3 = 0. To get an initial population of 1200 as before, we multiply x by 
1200/(1 + 0.5 + 0.125) = 738. Answer: Proportional growth of the numbers of females in the three classes 
will occur if the initial values are 738, 369, 92 in classes 1, 2, 3, respectively. The growth rate will be 1.2 per 
3 years. | 


Vibrating System of Two Masses on Two Springs (Fig. 161) 


Mass-spring systems involving several masses and springs can be treated as eigenvalue problems. For instance, 
the mechanical system in Fig. 161 is governed by the system of ODEs 


yt 3y1 — 2(y1 — ya) 5y1 + 2y2 
(6) 


yo 2(y2 — yr) 2y1 — 2ya 


where y; and yz are the displacements of the masses from rest, as shown in the figure, and primes denote 
derivatives with respect to time f. In vector form, this becomes 


yy =5 2\| yz 
(7) y” = P = Ay = ; 
y2 2 —-2\\ yo 
k,=3 
(y, = 0) m,=1 
1 1 | 
Vy a= 
ky=2 (Net change in 
spring length 
(yp = 0) m,= 1 =¥y-I,) 
J 
System in 
static System in 
equilibrium motion 


Fig. 161. Masses on springs in Example 4 
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We try a vector solution of the form 
(8) y = xe", 


This is suggested by a mechanical system of a single mass on a spring (Sec. 2.4), whose motion is given by 
exponential functions (and sines and cosines). Substitution into (7) gives 


w°xe’! = Axe”, 


Dividing by e“ and writing w” = A, we see that our mechanical system leads to the eigenvalue problem 


2 


(9) Ax = Ax where A = o*. 


From Example | in Sec. 8.1 we see that A has the eigenvalues A; = —1 and Ag = —6. Consequently, 
w = +V-1 = +iand V—6 = +iV6, respectively. Corresponding eigenvectors are 


(10) 


From (8) we thus obtain the four complex solutions [see (10), Sec. 2.2] 
xen = x;(cost + isin), 
xye7!V6t = xo(cos V6t + isin V6 0). 
By addition and subtraction (see Sec. 2.2) we get the four real solutions 
X1 COS ¢, x1 sin ¢, x2. cos V6 1, x2 sin V6 t. 
A general solution is obtained by taking a linear combination of these, 


y = x1 (a, cos t + by sin t) + xg (agcos V6t + by sin V6 t) 


with arbitrary constants a, by, dg, be (to which values can be assigned by prescribing initial displacement and 
initial velocity of each of the two masses). By (10), the components of y are 


yy = aycost + by sint + 2agcos V6t + 2bysin V6t 
yo = 2a, cost + 2b; sint — ag cos V6t — bysin V6 t. 


These functions describe harmonic oscillations of the two masses. Physically, this had to be expected because 
we have neglected damping. B 


PROBLEM SET 8.2 


1-6 


ELASTIC DEFORMATIONS 


Given A in a deformation y = Ax, find the principal 
directions and corresponding factors of extension or 


7-9| MARKOV PROCESSES 


Find the limit state of the Markov process modeled by the 
given matrix. Show the details. 


contraction. Show the details. 


: ; [0.2 0.5 

3.0 15 20 04 7. 

1. 2. [0.8 0.5 

15 3.0 04 2.0 

: . 04 03 03 06 O1 02 
7 V6 5 2 

3. 4. 8.103 06 O11] 9/04 O11 04 
V6 2 2 13 

i ‘ 03 O1 0.6 0 O8 04 
1 4 125 0.75 

5. 6. 

zl [0.75 1.25 
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10-12 


AGE-SPECIFIC POPULATION 


Find the growth rate in the Leslie model (see Example 3) 
with the matrix as given. Show the details. 


10. 


12. 


0 90 5.0 0 3.45 0.60 
04 0 0 | 11.]0.90 0 0 
| 0 04 0 | 0 045 0 
[0 30 20 20 

05 0 0 0 

0 05 0 0 

0 0 01 0 


13-15] LEONTIEF MODELS' 


13. 


14. 


15. 


Leontief input-output model. Suppose that three 
industries are interrelated so that their outputs are used 
as inputs by themselves, according to the 3 x 3 
consumption matrix 


lol 05 O 
A= [axl =|0.8 0 0.4 
10.1 05 0.6 


where aj, is the fraction of the output of industry k 
consumed (purchased) by industry j. Let p; be the price 
charged by industry j for its total output. A problem is 
to find prices so that for each industry, total 
expenditures equal total income. Show that this leads 
to Ap =p, where p=[p, ps psi’, and find a 
solution p with nonnegative p4, Po, p3- 

Show that a consumption matrix as considered in Prob. 
13 must have column sums 1 and always has the 
eigenvalue 1. 


Open Leontief input-output model. If not the whole 
output but only a portion of it is consumed by the 


industries themselves, then instead of Ax = x (as in Prob. 
13), we have x — Ax = y, where x = [x1 x2 x3" 
is produced, Ax is consumed by the industries, and, thus, 
y is the net production available for other consumers. 
Find for what production x a given demand vector 


y =[0.1 0.3  0.1]" can be achieved if the consump- 
tion matrix is 
[0.1 04 02 
A =| 0.5 0 0.1 
|. 0.1 0.4 0.4 


16-20 


GENERAL PROPERTIES OF EIGENVALUE 


PROBLEMS 


Let A = [aj] be an n X n matrix with (not necessarily 


distinct) eigenvalues A, -- 
16. 


17. 


18. 


19. 


20. 


*, An. Show. 
Trace. The sum of the main diagonal entries, called 
the trace of A, equals the sum of the eigenvalues of A. 


shift.” A—kI has the eigenvalues 
-, A, — k and the same eigenvectors as A. 


“Spectral 
Ay —k,:: 
Scalar multiples, powers. kA has the eigenvalues 


kdy,+++, KAy. A''(m = 1,2,--+) has the eigenvalues 
7’,-->, An’. The eigenvectors are those of A. 

Spectral mapping theorem. The “polynomial 

matrix” 

D(A) = ky A™ + Ky —1A™ 4 kA + kol 

has the eigenvalues 

PO) = hl” + hencad oe hay + ho 

where j = 1,---,m, and the same eigenvectors as A. 


Perron’s theorem. A Leslie matrix L with positive 
112, 113, 121, 139 has a positive eigenvalue. (This is a 
special case of the Perron—Frobenius theorem in Sec. 
20.7, which is difficult to prove in its general form.) 


8.3 Symmetric, Skew-Symmetric, 
and Orthogonal Matrices 


We consider three classes of real square matrices that, because of their remarkable 
properties, occur quite frequently in applications. The first two matrices have already been 
mentioned in Sec. 7.2. The goal of Sec. 8.3 is to show their remarkable properties. 


IWASSILY LEONTIEF (1906-1999). American economist at New York University. For his input-output 
analysis he was awarded the Nobel Prize in 1973. 
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DEFINITIONS 


EXAMPLE —1 


EXAMPLE 2 


THEOREM -1 


Symmetric, Skew-Symmetric, and Orthogonal Matrices 

A real square matrix A = [a;,] is called 
symmetric if transposition leaves it unchanged, 

(1) Al thus kj = Gr, 
skew-symmetric if transposition gives the negative of A, 

(2) A’ — = is\. thus aA = —~ak> 


orthogonal if transposition gives the inverse of A, 


(3) Aloe 


Symmetric, Skew-Symmetric, and Orthogonal Matrices 


The matrices 


[-3 1 5| [0 9 -12 [fe oe oe 
1 0 -2 —9 0 20 2 2 § 
| 5 -2 4 12 —20 0 7 4 a 


are symmetric, skew-symmetric, and orthogonal, respectively, as you should verify. Every skew-symmetric 
matrix has all main diagonal entries zero. (Can you prove this?) ia] 


Any real square matrix A may be written as the sum of a symmetric matrix R and a skew- 
symmetric matrix S, where 


(4) R=3(A+A‘) and S=3(A- A’). 


Illustration of Formula (4) 


9 5 2 90 3.5 3.5 0 15-15 
A=|2 3 -8/=R+S =| 35 3.0 2.0} +] -1.5 0 —6.0 |_| 
[5 4 3] [3.5 -2.0 3.0 1.5 60 0 | 


Eigenvalues of Symmetric and Skew-Symmetric Matrices 


(a) The eigenvalues of a symmetric matrix are real. 


(b) The eigenvalues of a skew-symmetric matrix are pure imaginary or zero. 


This basic theorem (and an extension of it) will be proved in Sec. 8.5. 
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EXAMPLE 3 


THEOREM 2 


PROOF 
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Eigenvalues of Symmetric and Skew-Symmetric Matrices 


The matrices in (1) and (7) of Sec. 8.2 are symmetric and have real eigenvalues. The skew-symmetric matrix 
in Example | has the eigenvalues 0, —25i, and 257. (Verify this.) The following matrix has the real eigenvalues 
1 and 5 but is not symmetric. Does this contradict Theorem 1? 


3 4 
a 
1 3 


Orthogonal Transformations and Orthogonal Matrices 


Orthogonal transformations are transformations 
(5) y = Ax where A is an orthogonal matrix. 


With each vector x in R” such a transformation assigns a vector y in R”. For instance, 
the plane rotation through an angle 0 


yy cos@ —sin@}} x1 


(6) [= = 
yo sin 8 cos 6 || x2 


is an orthogonal transformation. It can be shown that any orthogonal transformation in 
the plane or in three-dimensional space is a rotation (possibly combined with a reflection 
in a straight line or a plane, respectively). 

The main reason for the importance of orthogonal matrices is as follows. 


Invariance of Inner Product 


An orthogonal transformation preserves the value of the inner product of vectors 
a and b in R”, defined by 


(7) asb=alb=[aq, -*: al 


That is, for any a and b in R", orthogonal n X n matrix A, and u = Aa,v = Ab 
we haveuev=aerb. 

Hence the transformation also preserves the length or norm of any vector a in 
R” given by 


(8) jal = Vaea= Vala. 


Let A be orthogonal. Let u = Aa and v = Ab. We must show that u* v = a* b. Now 
(Aa)' = a'AT by (10d) in Sec. 7.2 and A'A = A1A = I by (3). Hence 


(9) uev = uly = (Aa)'Ab = a! A‘Ab = a'Ib = a'b = arb. 


From this the invariance of ||al| follows if we set b = a. i 
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THEOREM 3 


PROOF 


THEOREM 4 


PROOF 


EXAMPLE 4 


THEOREM 5 


Orthogonal matrices have further interesting properties as follows. 


Orthonormality of Column and Row Vectors 


A real square matrix is orthogonal if and only if its column vectors ay, +++, Aj, (and 
also its row vectors) form an orthonormal system, that is, 


0 if j#k 
(10) aj? a, = aja, = { 
1 if j=k. 
(a) Let A be orthogonal. Then A 7A = A‘A = L In terms of column vectors a4,°°', an, 
al aja; aja ‘t+ ajay 
Gl) IT=ATA=ATA=] © |fay---ay] = 
ay, aia; alas cs ala, 


The last equality implies (10), by the definition of the n X n unit matrix I. From (3) it 
follows that the inverse of an orthogonal matrix is orthogonal (see CAS Experiment 12). 
Now the column vectors of A-l(=A’) are the row vectors of A. Hence the row vectors 
of A also form an orthonormal system. 

(b) Conversely, if the column vectors of A satisfy (10), the off-diagonal entries in (11) 
must be 0 and the diagonal entries 1. Hence A'A = I, as (11) shows. Similarly, AAT = I. 
This implies Av = A? because also A-1A = AA7? = Land the inverse is unique. Hence 
A is orthogonal. Similarly when the row vectors of A form an orthonormal system, by 
what has been said at the end of part (a). 3] 


Determinant of an Orthogonal Matrix 


The determinant of an orthogonal matrix has the value +1 or —1. 


From det AB = det A det B (Sec. 7.8, Theorem 4) and det A’ = det A (Sec. 7.7, 
Theorem 2d), we get for an orthogonal matrix 


1 = det I = det(AA!) = det(AA') = det A det A’ = (det A)”. | 


Illustration of Theorems 3 and 4 


The last matrix in Example 1 and the matrix in (6) illustrate Theorems 3 and 4 because their determinants are 
—1 and +1, as you should verify. B 


Eigenvalues of an Orthogonal Matrix 


The eigenvalues of an orthogonal matrix A are real or complex conjugates in pairs 
and have absolute value |. 
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PROOF 


EXAMPLE 5 
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The first part of the statement holds for any real matrix A because its characteristic 
polynomial has real coefficients, so that its zeros (the eigenvalues of A) must be as 
indicated. The claim that |A] = 1 will be proved in Sec. 8.5. a 


Eigenvalues of an Orthogonal Matrix 


The orthogonal matrix in Example | has the characteristic equation 


3+ 27 +2A-1=0. 


Now one of the eigenvalues must be real (why?), hence +1 or —1. Trying, we find —1. Division by A + 1 
gives -(? - 5A/3 + 1) = 0 and the two eigenvalues (5 + iV11)/6 and (5 — iV/11)/6, which have absolute 
value 1. Verify all of this. a 


Looking back at this section, you will find that the numerous basic results it contains have 
relatively short, straightforward proofs. This is typical of large portions of matrix 


eigenvalue theory. 


PROBLEM—SET 8-3 


1-10 


SPECTRUM 


Are the following matrices symmetric, skew-symmetric, or 
orthogonal? Find the spectrum of each, thereby illustrating 
Theorems | and 5. Show your work in detail. 


1. 


- 


11. 


12. 


(b) Rotation. Show that (6) is an orthogonal trans- 
formation. Verify that it satisfies Theorem 3. Find the 
inverse transformation. 


(c) Powers. Write a program for computing powers 


0.8 06 a b A™(m = 1,2,---) of a 2 X 2 matrix A and their 
‘ ‘ 2. spectra. Apply it to the matrix in Prob. 1 (call it A). To 
—0.6 0.8 —b a what rotation does A correspond? Do the eigenvalues 
of A” have a limit as m—> ~? 
2 4 cone caine (d) Compute the eigenvalues of (0.9A)"", where A is 
—8 2 a f) cos 0 the matrix in Prob. 1. Plot them as points. What is their 
_ _ limit? Along what kind of curve do these points 
6 0 0 a k k approach the limit? 
0 2 2 6. | k ‘ k (e) Find A such that y = Ax is a counterclockwise 
rotation through 30° in the plane. 
|O —-2 5 Lk k a 
| 0 9 12 lia 0 0 13-20| GENERAL PROPERTIES 
—9 0 20 8.10 cos@ —sin@ 13. Verification. Verify the statements in Example 1. 
; 14. Verify the statements in Examples 3 and 4. 
12. —20 0 0 sin 0 cos 8 . 
~ ~ 15. Sum. Are the eigenvalues of A + B sums of the 
0 0 1 4 8 @ eigenvalues of A and of B? 
0 1 0 10. | —2 4 4 16. Orthogonality. Prove that eigenvectors of a symmetric 
. 2 7 2 matrix corresponding to different eigenvalues are 
=] ) ) —$ 3 g orthogonal. Give examples. 


WRITING PROJECT. Section Summary. Sum- 
marize the main concepts and facts in this section, 
giving illustrative examples of your own. 

CAS EXPERIMENT. Orthogonal Matrices. 

(a) Products. Inverse. Prove that the product of two 
orthogonal matrices is orthogonal, and so is the inverse 
of an orthogonal matrix. What does this mean in terms 
of rotations? 


17. 


18. 


19. 


20. 


Skew-symmetric matrix. Show that the inverse of a 
skew-symmetric matrix is skew-symmetric. 


Do there exist nonsingular skew-symmetric n X n 
matrices with odd n? 


Orthogonal matrix. Do there exist skew-symmetric 
orthogonal 3 X 3 matrices? 


Symmetric matrix. Do there exist nondiagonal 
symmetric 3 X 3 matrices that are orthogonal? 
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8.4 Eigenbases. Diagonalization. 
Quadratic Forms 


THEOREM -1 


PROOF 


So far we have emphasized properties of eigenvalues. We now turn to general properties 
of eigenvectors. Eigenvectors of ann X n matrix A may (or may not!) form a basis for 
R”. If we are interested in a transformation y = Ax, such an “eigenbasis” (basis of 
eigenvectors)—if it exists—is of great advantage because then we can represent any x in 
R” uniquely as a linear combination of the eigenvectors x1,°--, X,, Say, 


X = C1X1 + CoXg +++ + CyXn.- 


And, denoting the corresponding (not necessarily distinct) eigenvalues of the matrix A by 
A1,°**, An, we have Ax; = Ajx;, so that we simply obtain 


y = Ax = A(cyxy + +++ + CpXn) 


(1) = cy AXx} ai CnAXn, 


= CyA1X1 +++ + CyAnXn- 


This shows that we have decomposed the complicated action of A on an arbitrary vector 
x into a sum of simple actions (multiplication by scalars) on the eigenvectors of A. This 
is the point of an eigenbasis. 

Now if the n eigenvalues are all different, we do obtain a basis: 


Basis of Eigenvectors 


Ifann X n matrix A has n distinct eigenvalues, then A has a basis of eigenvectors 
X1,°°°, Xy for R”. 


All we have to show is that x1, ---, X, are linearly independent. Suppose they are not. Let 
r be the largest integer such that {x1,---,x,} is a linearly independent set. Then r <n 
and the set {X1,°-+, X;, X41} is linearly dependent. Thus there are scalars c1,°++, Crit, 
not all zero, such that 
(2) CyXy +++ + Cp 4 1Xr41 = 0 
(see Sec. 7.4). Multiplying both sides by A and using Ax; = A;x;, we obtain 
(3) A(cyXy + ort+ + Cpa X pga) = CA $+ + Crp ApgaXrt1 = AO = 0. 
To get rid of the last term, we subtract A, times (2) from this, obtaining 

CUAq = Apya)&1 Ft +H CpAp — Ap ea)Xr = 0. 
Here cy(Ay — Arii) = 0,°°+, C(Ap — Apis) = Osince {x1,+-+-,x,} is linearly independent. 
Hence cy = -:: =, = 0, since all the eigenvalues are distinct. But with this, (2) reduces to 


Cr41Xr+1 = 0, hence c+, = 0, since x,;+, # 0 (an eigenvector!). This contradicts the fact 
that not all scalars in (2) are zero. Hence the conclusion of the theorem must hold. 5] 
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EXAMPLE 1 


THEOREM 2 


EXAMPLE 2 


DEFINITION 


THEOREM 3 
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Eigenbasis. Nondistinct Eigenvalues. Nonexistence 


5 3 
The matrix A = has a basis of eigenvectors 


1 1 
| | | corresponding to the eigenvalues A, = 8, 


Ag = 2. (See Example | in Sec. 8.2.) 
Even if not all n eigenvalues are different, a matrix A may still provide an eigenbasis for R”. See Example 2 
in Sec. 8.1, where n = 3. 
On the other hand, A may not have enough linearly independent eigenvectors to make up a basis. For 
instance, A in Example 3 of Sec. 8.1 is 
0 1 k 


A= and has only one eigenvector 


| (k # 0, arbitrary). |_| 


0 0 0 


Actually, eigenbases exist under much more general conditions than those in Theorem 1. 
An important case is the following. 


Symmetric Matrices 


A symmetric matrix has an orthonormal basis of eigenvectors for R”. 


For a proof (which is involved) see Ref. [B3], vol. 1, pp. 270-272. 


Orthonormal Basis of Eigenvectors 


The first matrix in Example 1 is symmetric, and an orthonormal basis of eigenvectors is [1/V2 1/V2]", 


H/Vv2 -1/v2]". 


Similarity of Matrices. Diagonalization 


Eigenbases also play a role in reducing a matrix A to a diagonal matrix whose entries are 
the eigenvalues of A. This is done by a “similarity transformation,” which is defined as 
follows (and will have various applications in numerics in Chap. 20). 


Similar Matrices. Similarity Transformation 


Ann X n matrix A is called similar to an n X n matrix A if 
(4) A=P7'AP 


for some (nonsingular!) n X n matrix P. This transformation, which gives A from 
A, is called a similarity transformation. 


The key property of this transformation is that it preserves the eigenvalues of A: 


Eigenvalues and Eigenvectors of Similar Matrices 


IfA is similar to A, then A has the same eigenvalues as A. 
Furthermore, if x is an eigenvector of A, then y = P~'x is an eigenvector of A 
corresponding to the same eigenvalue. 
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PROOF 


EXAMPLE 3 


THEOREM 4 


From Ax = Ax (A an eigenvalue, x # 0) we get P~1Ax = AP~!x. Now I = PP™!. By 
this identity trick the equation P~'Ax = AP~ ‘x gives 


P-tAx = P“AIx = P~1APP7!x = (P~1AP)P7 tx = A(P71x) = APT x. 


Hence A is an eigenvalue of A and P7'xa corresponding eigenvector. Indeed, Pix #0 
because P~'x = 0 would give x = Ix = PP~!x = PO = 0, contradicting x # 0. a 
Eigenvalues and Vectors of Similar Matrices 


6 —3 
Let, A= 


| and P= 


Then A= 


Here P+ was obtained from (4*) in Sec. 7.8 with det P = 1. We see that A has the eigenvalues Ay = 3, Ag = 2. 
The characteristic equation of A is (6 — A)(—1 — A) + 12 2 — 5A + 6 = 0. Ithas the roots (the eigenvalues 
of A) Ay = 3, Ag = 2, confirming the first part of Theorem 3. 


We confirm the second part. From the first component of (A — ADx = 0 we have (6 — A)x, — 3x 
A = 3 this gives 3x, — 3x2 = 0, say,x, =[1 1)". ForA = 2 it gives 4x, — 3x2 = 0, say, xo = [ 
Theorem 3 we thus have 


ere rE 


Indeed, these are eigenvectors of the diagonal matrix A. 
Perhaps we see that x; and xg are the columns of P. This suggests the general method of transforming a 
matrix A to diagonal form D by using P = X, the matrix with eigenvectors as columns. 3] 


2 = 0. For 
3 4)". In 


By a suitable similarity transformation we can now transform a matrix A to a diagonal 
matrix D whose diagonal entries are the eigenvalues of A: 


Diagonalization of a Matrix 


If ann X n matrix A has a basis of eigenvectors, then 


(5) D =x" tAx 


is diagonal, with the eigenvalues of A as the entries on the main diagonal. Here X 
is the matrix with these eigenvectors as column vectors. Also, 


(5*) Dp” = xt a”’x (m = 2, 3,°°*). 
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PROOF 


EXAMPLE 4 
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Let x1,+--,X,, be a basis of eigenvectors of A for R”. Let the corresponding eigenvalues 
of A be Ay,°°*,An, respectively, so that Ax, = A,xj,:--, AX, = A,X,. Then 
X = [x,-°- x,,] has rank n, by Theorem 3 in Sec. 7.4. Hence X7? exists by Theorem | 
in Sec. 7.8. We claim that 


(6) Ax = A[xy ++: X,] = [Ax, °°: Axy,] = [AixX, -** AyX,] = XD 


where D is the diagonal matrix as in (5). The fourth equality in (6) follows by direct 
calculation. (Try it for n = 2 and then for general n.) The third equality uses Ax, = A;Xx. 
The second equality results if we note that the first column of AX is A times the first 
column of X, which is x, and so on. For instance, when n = 2 and we write 


X1 = [%11 X21], X2 = [X12 X22], we have 


441 42) |%11 *12 
AX = A[x1 x2] = 
[421 422) |*21 %*22 


441X141 + Ay2X21 411X142 + Ay2X22 
= _ [Ax, Axs |. 
| d2iX11 + de2X01 dg1X12 + de2Xe0 
Column | Column 2 


If we multiply (6) by X7! from the left, we obtain (5). Since (5) is a similarity 
transformation, Theorem 3 implies that D has the same eigenvalues as A. Equation (5*) 
follows if we note that 


D? = DD = (X_!AX)(X~!AX) = X-1A(KX7 AX = X"1AAX = X71A2X, etc. 


Diagonalization 


Diagonalize 


A =| -115 1.0 5.5: |. 


177, 18 9.3 | 


Solution. The characteristic determinant gives the characteristic equation —A® — A? + 12A = 0. The roots 
(eigenvalues of A) are Ay = 3, Ag = —4, Ag = 0. By the Gauss elimination applied to (A — ADx = 0 with 
A = Aj, Ag, Ag we find eigenvectors and then x7! by the Gauss—Jordan elimination (Sec. 7.8, Example 1). The 


results are 


f-1] [| a] [2] f-1 1 2] [-0.7 02 0.31] 
3], |}-1], Jl], X=] 3 -1 1], x t=/-13 -02 0o7]. 
= 3 4 |-1 3) | | 08 02 -0.2] 


[-07 02 03]/-3 -4 0 3 0 O 


D =x 1Ax 13-02 07 9 4 O]=/0 -4 oO}. a] 


| 0.8 0.2 —0:2.)|.-3.° -12 0 0 0 0 
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Quadratic Forms. Transformation to Principal Axes 


By definition, a quadratic form Q in the components x1,°--,x, of a vector x is a sum 
of n? terms, namely, 


n n 
Q = x! Ax = > » AkXjXk 


j=1k=1 
= 2 
= a44X1 ste Q12X 1X2 ap oon a AAnX1Xn 
(7) ain CoN ale Ge SF 22° PF ibyaione. 
TE er eh oe hn ee 
2 
ar Ghacraii Im Chypnenyesy sir C2 IP Chava Gan 


A = [aj] is called the coefficient matrix of the form. We may assume that A is 
symmetric, because we can take off-diagonal terms together in pairs and write the result 
as a sum of two equal terms; see the following example. 


Quadratic Form. Symmetric Coefficient Matrix 


Let 


3 4 x1 
x'Ax = [x1 sl | | 3x2 + 4xyxo + 6xoxy + 2x2 = 3x? + 10xyxo + 2x3. 
6 2 x2 


Here 4 + 6 = 10 = 5 + 5. From the corresponding symmetric matrix C = [cj,], where cj, = 3 (ain + apj)s 
thus cyy = 3, cyg = co, = 5, cog = 2, we get the same result; indeed, 


3 S| X1 
x'Cx = [x1 sl | | 3x2 + Sxyxo + Sxqxy + 2x2 = 3x? + 10xyxo + 2x3, 8 
5 2] | xe 


Quadratic forms occur in physics and geometry, for instance, in connection with conic 
sections (ellipses x4/a" + x3/b? = 1, etc.) and quadratic surfaces (cones, etc.). Their 
transformation to principal axes is an important practical task related to the diagonalization 
of matrices, as follows. 

By Theorem 2, the symmetric coefficient matrix A of (7) has an orthonormal basis of 
eigenvectors. Hence if we take these as column vectors, we obtain a matrix X that is 
orthogonal, so that x71 = x". From (5) we thus have A = XDX~! = XDX". Substitution 
into (7) gives 


(8) O = x'XDXx'x. 
If we set X'x = y, then, since x'= x, we have X71x = y and thus obtain 
(9) x = Xy. 


Furthermore, in (8) we have x'X = (X'x)' = y’ and X'x = y, so that Q becomes simply 


(10) OQ = y "Dy = Ay + Agys +--+ Anya. 
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EXAMPLE 6 
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This proves the following basic theorem. 


Principal Axes Theorem 


The substitution (9) transforms a quadratic form 


n n 
Q = x Ax = > > AjkX 7X k (Any = Dj) 
j=l1k=1 


to the principal axes form or canonical form (10), where Ay,-°--, Ay are the (not 
necessarily distinct) eigenvalues of the (symmetric!) matrix A, and X is an 
orthogonal matrix with corresponding eigenvectors X,,°**,Xp, respectively, as 
column vectors. 


Transformation to Principal Axes. Conic Sections 


Find out what type of conic section the following quadratic form represents and transform it to principal axes: 


O = 17x74 — 30xyx9 + 17x83 = 128. 


17. -15 Xy 
; x= : 
=15 17 Xo 
This gives the characteristic equation (17 — d)? — 157 = 0. It has the roots Ay = 2, Az = 32. Hence (10) 
becomes 


Solution. We have Q= x" Ax, where 


O = 2y} + 32y5. 
We see that Q = 128 represents the ellipse 2y? + 32y3 = 128, that is, 


yi ye 
oe 


If we want to know the direction of the principal axes in the x 1x 2-coordinates, we have to determine normalized 
eigenvectors from (A — AI)x = 0 with A = Ay = 2 and A = Ag = 32 and then use (9). We get 


ies a 
and ‘ 


1/Vv2 1/V2 


hence 


=Xy= 
ON Tava yi xa = ya/'V2 + yo/'V2. 


v2 


1/Vv2 as a xy = yn/V2 — yo/V2 


This is a 45° rotation. Our results agree with those in Sec. 8.2, Example 1, except for the notations. See also 
Fig. 160 in that example. le 
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PROBLEM SET 8.4 


1-5 


SIMILAR MATRICES HAVE EQUAL 
EIGENVALUES 


Verify this for A and A = P~1AP. If y is an eigenvector 
of P, show that x = Py are eigenvectors of A. Show the 


details of your work. 


8. Orthonormal basis. Illustrate Theorem 2 with further 
examples. 


9-16 


DIAGONALIZATION OF MATRICES 


[34] [—4 2 
A= . P= 
[4-3] | 3-1 
‘1 0] (ae = 
A= , P= 
[2-1] [10-7 
lg —4] | 0.28 0.96 
A= , P= 
[22] |-0.96 0.28 
lo o- 3 [2 o 3 
-A=|0 3 2], P=/0o 1 O|, 
I} 0 1 es O° 3 
A, =3 
ls 2 WG lo. -* w 
A=| 3 4 -9], P=l1 0 0 
a6. i 45 lo oO 1 


6. PROJECT. Similarity of Matrices. Similarity is 
basic, for instance, in designing numeric methods. 

(a) Trace. By definition, the trace of ann X n matrix 
A = [ax] is the sum of the diagonal entries, 


d292 pees at 


trace A = aq, + Gie. 


Show that the trace equals the sum of the eigenvalues, 
each counted as often as its algebraic multiplicity 
indicates. Illustrate this with the matrices A in Probs. 
1, 3, and 5. 

(b) Trace of product. Let B = [bj] ben X n. Show 
that similar matrices have equal traces, by first proving 


n n 
trace AB = ey Dd ahi = trace BA. 
i=11=1 


(c) Find a relationship between A in (4) and 
A = PAP“. 

(d) Diagonalization. What can you do in (5) if you 
want to change the order of the eigenvalues in D, for 
instance, interchange dy, = A, and dog = Ag? 

- No basis. Find further 2 X 2 and 3 X 3 matrices 
without eigenbasis. 


Find an eigenbasis (a basis of eigenvectors) and diagonalize. 


Show the details. 


=19 a 
11. 
—42 16 


4 0 0 
13.}12 -2 0 
[21 -6 1 
| -5 =6 6 
14.) -9 -8 12], A=-2 
}-12 -12 16 
[4 33 
5.)/3 6 I], A=10 
13 1 6 
[1 1 0 
16/1 1 0 
10 0 -4 
17-23| PRINCIPAL AXES. CONIC SECTIONS 


What kind of conic section (or pair of straight lines) is given 
by the quadratic form? Transform it to principal axes. 


Express x’ = [xy 


xg] in terms of the new coordinate 


vector y =[y, ye], as in Example 6. 
17. 7x? + 6xyxq + 7x3 = 200 

18. 3x7 + 8xyxq — 3x3 = 10 

19. 3x? + 22xyx9 + 3x3 =0 

20. 9x} + 6xpxe + x3 = 10 

21. xt — 12xyx2 + x3 = 70 

22. 4x7 + 12xyx2 + 13x3 = 16 

23. —11x? + 84xyx2 + 24x} = 156 
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24. Definiteness. A quadratic form Q(x) = x Ax and its 


25. 


8.5 Complex Matrices and Forms. 


(symmetric!) matrix A are called (a) positive definite 
if Q(x) > 0 for all x # 0, (b) negative definite if 
Q(x) < 0 for all x # 0, (c) indefinite if Q(x) takes 
both positive and negative values. (See Fig. 162.) 
[ Q(x) and A are called positive semidefinite (negative 
semidefinite) if Q(x) 2 0 (Q(x) S 0) for all x.] Show 
that a necessary and sufficient condition for (a), (b), 
and (c) is that the eigenvalues of A are (a) all positive, 
(b) all negative, and (c) both positive and negative. 
Hint. Use Theorem 5. 


Definiteness. A necessary and sufficient condition for 
positive definiteness of a quadratic form Q(x) = x'Ax 
with symmetric matrix A is that all the principal minors 
are positive (see Ref. [B3], vol. 1, p. 306), that is, 


a1 a2, 
a1 > 0, > 0, 
a2 a22, 
a1 a2 413 
a2 a22 d23| > 0, ase det A > 0. 
413 a23 433, 


Show that the form in Prob. 22 is positive definite, 
whereas that in Prob. 23 is indefinite. 
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x2 


(a) Positive definite form 


SSS SSSI 
SSSOSSSS 
Ay 


SS > 
SSS 
SSX SQ 


(c) Indefinite form 


Fig. 162. Quadratic forms in two variables (Problem 24) 


Optional 


The three classes of matrices in Sec. 8.3 have complex counterparts which are of practical 
interest in certain applications, for instance, in quantum mechanics. This is mainly because 
of their spectra as shown in Theorem | in this section. The second topic is about extending 
quadratic forms of Sec. 8.4 to complex numbers. (The reader who wants to brush up on 
complex numbers may want to consult Sec. 13.1.) 


Notations 


A = [Gx] is obtained from A = [ajx] by replacing each entry aj, = a + iB 


(a, B real) with its complex conjugate aj, = a — iB. Also, A 
of A, hence the conjugate transpose of A. 


= [aj] is the transpose 


EXAMPLE 1_ Notations 
34+ 4i La _ 3 -4i 
IfA = , then A= 
6 2-5 6 
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DEFINITION 


EXAMPLE 2 


Hermitian, Skew-Hermitian, and Unitary Matrices 


A square matrix A = [a,j] is called 


Hermitian if A” = A, that is, anj = Ax, 
skew-Hermitian if A! = —A, that is, nj = — Ax 
unitary if A’ =A, 


The first two classes are named after Hermite (see footnote 13 in Problem Set 5.8). 

From the definitions we see the following. If A is Hermitian, the entries on the main 
diagonal must satisfy aj; = a,j; that is, they are real. Similarly, if A is skew-Hermitian, 
then aj; = —aj;. If we set aj; = a + iP, this becomes a — iB = —(a@ + if). Hence a = 0, 
so that a;; must be pure imaginary or 0. 


Hermitian, Skew-Hermitian, and Unitary Matrices 


4 1 — 3i 3i 2Q+i xi 3V3 
A= B= = 
1 + 3i 7 -2+i -i 3V3 hi 

are Hermitian, skew-Hermitian, and unitary matrices, respectively, as you may verify by using the definitions. Ml 
If a Hermitian matrix is real, then A’ =A" =A. Hence a real Hermitian matrix is a 
symmetric matrix (Sec. 8.3). 

Similarly, if a skew-Hermitian matrix is real, then A’ = A’ = —A. Hence areal skew- 
Hermitian matrix is a skew-symmetric matrix. 

Finally, if a unitary matrix is real, then A’ =A' = A7!. Hence a real unitary matrix 


is an orthogonal matrix. 
This shows that Hermitian, skew-Hermitian, and unitary matrices generalize symmetric, 
skew-symmetric, and orthogonal matrices, respectively. 


Eigenvalues 


It is quite remarkable that the matrices under consideration have spectra (sets of eigenvalues; 
see Sec. 8.1) that can be characterized in a general way as follows (see Fig. 163). 


ImA - iti Ns i 
Pa Skew-Hermitian (skew-symmetric) 


Unitary (orthogonal) 


{ Hermitian (symmetric) 


1 Red 


Fig. 163. Location of the eigenvalues of Hermitian, skew-Hermitian, 
and unitary matrices in the complex A-plane 
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THEOREM 1 


EXAMPLE 3 


PROOF 
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Eigenvalues 
(a) The eigenvalues of a Hermitian matrix (and thus of a symmetric matrix) 
are real. 


(b) The eigenvalues of a skew-Hermitian matrix (and thus of a skew-symmetric 
matrix) are pure imaginary or zero. 


(c) The eigenvalues of a unitary matrix (and thus of an orthogonal matrix) have 
absolute value 1. 


Illustration of Theorem 1 


For the matrices in Example 2 we find by direct calculation 


Matrix Characteristic Equation Eigenvalues 
A Hermitian > — 11A + 18 =0 9, 2 
B — Skew-Hermitian 7 - 211+ 8 =0 4i, —2i 
C Unitary M-iA-1=0 $V3+4i, -3V3 + hi 
and |+3-V3 + $i)? =2 +2 =1. ial 


We prove Theorem 1. Let A be an eigenvalue and x an eigenvector of A. Multiply Ax = Ax 
from the left by x', thus x'Ax = Ax'x, and divide by xx =x wXy ttt $F XnXy = 
|x|? ofp. Sat. ae lxnl7, which is real and not O because x # 0. This gives 


(1) te x" Ax 
xx 


(a) If A is Hermitian, A‘ = AorA! = A and we show that then the numerator in (1) 
is real, which makes A real. x' Ax is a scalar; hence taking the transpose has no effect. Thus 


(2) x" Ax = (x'Ax)! = x"ATx = x! Ax = (xTAX). 
Hence, x'Ax equals its complex conjugate, so that it must be real. (a + ib = a — ib 
implies b = 0.) _ 
(b) If A is skew-Hermitian, A' = —A and instead of (2) we obtain 
(3) x'Ax = —(xTAx) 
so that x'Ax equals minus its complex conjugate and is pure imaginary or 0. 


(a + ib = —(a — ib) implies a = 0.) 
(c) Let A be unitary. We take Ax = Ax and its conjugate transpose 


(Ax)! = (Ax)! = Ax! 
and multiply the two left sides and the two right sides, 


(Ax)'Ax = AAX'x = |A|?x'x. 
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THEOREM 2 


PROOF 


DEFINITION 


THEOREM 3 


But A is unitary, A’ = A. so that on the left we obtain 
(Ax)"Ax = x'A ‘Ax = x'AJAx = X Ix = xx, 


Together, x'x= |A|?x"x. We now divide by x'x (40) to get lal? = 1. Hence |A| = 1. 
This proves Theorem | as well as Theorems | and 5 in Sec. 8.3. a 


Key properties of orthogonal matrices (invariance of the inner product, orthonormality of 
rows and columns; see Sec. 8.3) generalize to unitary matrices in a remarkable way. 

To see this, instead of R” we now use the complex vector space C” of all complex 
vectors with n complex numbers as components, and complex numbers as scalars. For 
such complex vectors the inner product is defined by (note the overbar for the complex 
conjugate) 


(4) aeb=a'b. 


The length or norm of such a complex vector is a real number defined by 


(5) l|al| = Vaea = Vala = Vay + -++ + Gpdy = Viay/? tieee $ | gyal". 


Invariance of Inner Product 


A unitary transformation, that is, y = Ax with a unitary matrix A, preserves the 
value of the inner product (4), hence also the norm (5). 


The proof is the same as that of Theorem 2 in Sec. 8.3, which the theorem generalizes. 
In the analog of (9), Sec. 8.3, we now have bars, 


uv =u'v = (Aa)'Ab = a'A ‘Ab = a'Ib = ab = arb. 


The complex analog of an orthonormal system of real vectors (see Sec. 8.3) is defined as 
follows. 


Unitary System 
A unitary system is a set of complex vectors satisfying the relationships 
0 if j#k 
(6) aj° aR = aj aK = 
1 if jr. 


Theorem 3 in Sec. 8.3 extends to complex as follows. 


Unitary Systems of Column and Row Vectors 


A complex square matrix is unitary if and only if its column vectors (and also its 
row vectors) form a unitary system. 
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PROOF 


THEOREM 4 


PROOF 


EXAMPLE 4 


THEOREM 5 


EXAMPLE 5 
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The proof is the same as that of Theorem 3 in Sec. 8.3, except for the bars required in 
A’ = A7! and in (4) and (6) of the present section. 


Determinant of a Unitary Matrix 


Let A be a unitary matrix. Then its determinant has absolute value one, that is, 
|det A] = 1. 


Similarly, as in Sec. 8.3, we obtain 


1 = det (AA~*) = det (AA) = det A det A’ = det A det A 
= det A det A = |det A|?. 


Hence |det A| = 1 (where det A may now be complex). o 


Unitary Matrix Illustrating Theorems 1c and 2—4 


For the vectors a’ = [2 ijandb'=[1+i 4i]wegeta’=[2 j]'anda'b = 2(1 +i) —4 24+ 2i 
and with 
0.87 0.6 i —0.8 + 3.2i 
= also Aa= and Ab = ‘ 
0.6 0.87 2 —2.6 + 0.6i 
as one can readily verify. This gives (Aa)'Ab = —2 + 2i, illustrating Theorem 2. The matrix is unitary. Its 


columns form a unitary system, 


ala, = —0.8i- 0.8i + 0.67 = 1, alas = —0.8i- 0.6 + 0.6 - 0.81 = 0, 


ajay = 0.67 + (—0.81)0.8i = 1 
and so do its rows. Also, det A = —1. The eigenvalues are 0.6 + 0.8iand —0.6 + 0.83, with eigenvectors [| 1 iy" 


and[1 —1]', respectively. ts 


Theorem 2 in Sec. 8.4 on the existence of an eigenbasis extends to complex matrices as 
follows. 


Basis of Eigenvectors 


A Hermitian, skew-Hermitian, or unitary matrix has a basis of eigenvectors for C” 
that is a unitary system. 


For a proof see Ref. [B3], vol. 1, pp. 270-272 and p. 244 (Definition 2). 


Unitary Eigenbases 


The matrices A, B, C in Example 2 have the following unitary systems of eigenvectors, as you should verify. 


1 
A: ——[1-3i 5]' (A=9), 1-31 -2]' (A=2) 
Vi ] [ ] 


1 
V 14 


3 
: [1-2i -5]" (A 2i) ie 142i)" (A= 4i) 
V30 V30 


c 1 i a=kG+ v3, 


WA lt 1’ A=30- V3). a 
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Hermitian and Skew-Hermitian Forms 


The concept of a quadratic form (Sec. 8.4) can be extended to complex. We call the 
numerator x'Ax in (1) a form in the components x4,°°:,X, of x, which may now be 
complex. This form is again a sum of n” terms 


n mn 
x'Ax = y > Aj XjX kc 
j=l k=1 


44X4X4 tere +t AynX1Xy 


(7) + dg4X9X4 tere Ht daynXoXn 
+ Can a ant at at Ga an Gat Meet RCS tC Re ee emt ae a 
+ dyitnd i ee + aden 


A is called its coefficient matrix. The form is called a Hermitian or skew-Hermitian 
form if A is Hermitian or skew-Hermitian, respectively. The value of a Hermitian form 
is real, and that of a skew-Hermitian form is pure imaginary or zero. This can be seen 
directly from (2) and (3) and accounts for the importance of these forms in physics. Note 
that (2) and (3) are valid for any vectors because, in the proof of (2) and (3), we did not 
use that x is an eigenvector but only that x'x is real and not 0. 


EXAMPLE 6 _ Hermitian Form 


For A in Example 2 and, say, x = [1 + i 


x'Ax=[1—i —5Si] 


1+ 3i 7 


4 l= 31 


5i]' we get 
1+i A(1 + i) + (1 — 3i)- Si 
=[1-i —-Si =223. Hf 
Si (1+ 31 + i) + 7° Si 


Clearly, if A and x in (4) are real, then (7) reduces to a quadratic form, as discussed in 


the last section. 


PROBLEM SET 8-5 


1-6| EIGENVALUES AND VECTORS 


Is the given matrix Hermitian? Skew-Hermitian? Unitary? 
Find its eigenvalues and eigenvectors. 


6 i i 1t+i 

1. 2 

= 6 -1+i 0 

i iV3 0 i 
3. 4. 

ive 4 iO 

li O O . 0 242% 0 
5/0 Oj 6.2% 6. 249) 

MO 2G | 0 2-2 0 


7. Pauli spin matrices. Find the eigenvalues and eigen- 
vectors of the so-called Pauli spin matrices and show 
that S,S, = iS,, SyS, = —iS,, S2 = 82 = 8S? =1, 
where 


8. Eigenvectors. Find eigenvectors of A, B, C in 
Examples 2 and 3. 
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9-12 


COMPLEX FORMS 


Is the matrix A Hermitian or skew-Hermitian? Find x'Ax. 
Show the details. 


4 3 - 2i —4i 
9 A= , x= 
34+ 2i —4 2+ 2i 
i —2+ 3i 2i 
10. A = ; x= 
2+ 3i 0 8 
PP 4 i 24) | 1 
11. A = =] 0) 3i |, x= i 
|-2+i 3h i | -i 
. 1 oa 4 . 1 
12. A =| -i 3 0}, x= i 
| 4 0 2 | -i 
13-20| GENERAL PROBLEMS 
13. Product. Show that (ABC)! = —C!BA for any 


n X n Hermitian A, skew-Hermitian B, and unitary C. 


14. 


16. 


17. 


18. 


20. 


Product. Show (BA)! = —AB for A and B in 
Example 2. For any n Xn Hermitian A and 
skew-Hermitian B. 


. Decomposition. Show that any square matrix may be 


written as the sum of a Hermitian and a skew-Hermitian 
matrix. Give examples. 


Unitary matrices. Prove that the product of two 
unitary n X n matrices and the inverse of a unitary 
matrix are unitary. Give examples. 


Powers of unitary matrices in applications may 
sometimes be very simple. Show that C!? =I in 
Example 2. Find further examples. 


Normal matrix. This important concept denotes a 
matrix that commutes with its conjugate transpose, 
AA! = A'A. Prove that Hermitian, skew-Hermitian, 
and unitary matrices are normal. Give corresponding 
examples of your own. 


. Normality criterion. Prove that A is normal if and 


only if the Hermitian and skew-Hermitian matrices in 
Prob. 18 commute. 

Find a simple matrix that is not normal. Find a normal 
matrix that is not Hermitian, skew-Hermitian, or 
unitary. 


CHAPTER—-8 REVIEW QUESTIONS AND PROBLEMS 


iF 


10. 


In solving an eigenvalue problem, what is given and 
what is sought? 


. Give a few typical applications of eigenvalue problems. 
. Do there exist square matrices without eigenvalues? 


. Can a real matrix have complex eigenvalues? Can a 


complex matrix have real eigenvalues? 


. Does a5 X 5 matrix always have a real eigenvalue? 
. What is algebraic multiplicity of an eigenvalue? Defect? 


. What is an eigenbasis? When does it exist? Why is it 


important? 


. When can we expect orthogonal eigenvectors? 


. State the definitions and main properties of the three 


classes of real matrices and of complex matrices that 
we have discussed. 


What is diagonalization? Transformation to principal axes? 


11-15 


SPECTRUM 


Find the eigenvalues. Find the eigenvectors. 


11. 


2.5 0.5 =} 4 
12. 
0.5 25 =12 a 


[Ah ot 
4.) 2 7 1 
I-11 85 
[Oo —3 =5 
5.1/3 0 -6 
16 6 0 
16-17| SIMILARITY 


Verify that A and A= p ‘AP have the same spectrum. 


19 12 2 4 
16. A = . Pe 
12 1 4 2 
7 4 . % 
17.A= . P= 
mg 7 3. °«5 
[-4 6 6 ‘it 2 =9 


18. A=] 0 2 0], 
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DIAGONALIZATION 22-25| CONIC SECTIONS. PRINCIPAL AXES 
Find an eigenbasis and diagonalize. Transform to canonical form (to principal axes). Express 
74 ia a [x1 x2]" in terms of the new variables [y, ya]. 

9 20 22. 9x3 — Oxyxq + 17x53 = 36 
-10 14 —56 513 


23. 4x? + 24xyx9 — 14x9 = 20 
24, 5x7 + 24xyx. — 5x2 =0 


—12 22 6 


25. 3.7x + 3.2xyx9 + 1.3x3 = 45 


SUMMARY—OF CHAPTER 8 


Linear Algebra: Matrix Eigenvalue Problems 


The practical importance of matrix eigenvalue problems can hardly be overrated. 
The problems are defined by the vector equation 


(1) Ax = Xx. 


A is a given square matrix. All matrices in this chapter are square. X is a scalar. To 
solve the problem (1) means to determine values of A, called eigenvalues (or 
characteristic values) of A, such that (1) has a nontrivial solution x (that is, x # 0), 
called an eigenvector of A corresponding to that A. An n X n matrix has at least 
one and at most n numerically different eigenvalues. These are the solutions of the 
characteristic equation (Sec. 8.1) 


a4, —A a2 ue An, 
21 dgg—A don 
(2) D(A) = det (A — AD) = = 0. 
Ani an2 + Any, — 2 


D(A) is called the characteristic determinant of A. By expanding it we get the 
characteristic polynomial of A, which is of degree n in A. Some typical applications 
are shown in Sec. 8.2. 

Section 8.3 is devoted to eigenvalue problems for symmetric (A’ = A), skew- 
symmetric (A' = —A), and orthogonal matrices (A' = Aq}. Section 8.4 
concerns the diagonalization of matrices and the transformation of quadratic forms 
to principal axes and its relation to eigenvalues. 

Section 8.5 extends Sec. 8.3 to the complex analogs of those real matrices, called 
Hermitian (A' = A), skew-Hermitian (A' = —A), and unitary matrices 
(A' = A’) All the eigenvalues of a Hermitian matrix (and a symmetric one) are 
real. For a skew-Hermitian (and a skew-symmetric) matrix they are pure imaginary 
or zero. For a unitary (and an orthogonal) matrix they have absolute value 1. 


CHAPTER 9 


Vector Differential Calculus. 
Grad, Div, Curl 


Engineering, physics, and computer sciences, in general, but particularly solid mechanics, 
aerodynamics, aeronautics, fluid flow, heat flow, electrostatics, quantum physics, laser 
technology, robotics as well as other areas have applications that require an understanding 
of vector calculus. This field encompasses vector differential calculus and vector integral 
calculus. Indeed, the engineer, physicist, and mathematician need a good grounding in 
these areas as provided by the carefully chosen material of Chaps. 9 and 10. 

Forces, velocities, and various other quantities may be thought of as vectors. Vectors 
appear frequently in the applications above and also in the biological and social sciences, 
so it is natural that problems are modeled in 3-space. This is the space of three dimensions 
with the usual measurement of distance, as given by the Pythagorean theorem. Within that 
realm, 2-space (the plane) is a special case. Working in 3-space requires that we extend 
the common differential calculus to vector differential calculus, that is, the calculus that 
deals with vector functions and vector fields and is explained in this chapter. 

Chapter 9 is arranged in three groups of sections. Sections 9.1—-9.3 extend the basic 
algebraic operations of vectors into 3-space. These operations include the inner product 
and the cross product. Sections 9.4 and 9.5 form the heart of vector differential calculus. 
Finally, Secs. 9.7—9.9 discuss three physically important concepts related to scalar and 
vector fields: gradient (Sec. 9.7), divergence (Sec. 9.8), and curl (Sec. 9.9). They are 
expressed in Cartesian coordinates in this chapter and, if desired, expressed in curvilinear 
coordinates in a short section in App. A3.4. 

We shall keep this chapter independent of Chaps. 7 and 8. Our present approach is in 
harmony with Chap. 7, with the restriction to two and three dimensions providing for a 
richer theory with basic physical, engineering, and geometric applications. 


Prerequisite: Elementary use of second- and third-order determinants in Sec. 9.3. 
Sections that may be omitted in a shorter course: 9.5, 9.6. 
References and Answers to Problems: App. | Part B, App. 2. 


9.1 Vectors in 2-Space and 3-Space 
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In engineering, physics, mathematics, and other areas we encounter two kinds of quantities. 
They are scalars and vectors. 

A scalar is a quantity that is determined by its magnitude. It takes on a numerical value, 
i.e., anumber. Examples of scalars are time, temperature, length, distance, speed, density, 
energy, and voltage. 
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In contrast, a vector is a quantity that has both magnitude and direction. We can say 
that a vector is an arrow or a directed line segment. For example, a velocity vector has 
length or magnitude, which is speed, and direction, which indicates the direction of motion. 
Typical examples of vectors are displacement, velocity, and force, see Fig. 164 as an 
illustration. 

More formally, we have the following. We denote vectors by lowercase boldface letters 
a, b, v, etc. In handwriting you may use arrows, for instance, a (in place of a), b, etc. 

A vector (arrow) has a tail, called its initial point, and a tip, called its terminal point. 
This is motivated in the translation (displacement without rotation) of the triangle in 
Fig. 165, where the initial point P of the vector a is the original position of a point, and 
the terminal point Q is the terminal position of that point, its position after the translation. 
The length of the arrow equals the distance between P and Q. This is called the length 
(or magnitude) of the vector a and is denoted by |a]. Another name for /ength is norm 
(or Euclidean norm). 

A vector of length 1 is called a unit vector. 


Velocity 


7 Earth 


Fig. 164. Force and velocity Fig. 165. Translation 


Of course, we would like to calculate with vectors. For instance, we want to find the 
resultant of forces or compare parallel forces of different magnitude. This motivates our 
next ideas: to define components of a vector, and then the two basic algebraic operations 
of vector addition and scalar multiplication. 

For this we must first define equality of vectors in a way that is practical in connection 
with forces and other applications. 


Equality of Vectors 


Two vectors a and b are equal, written a = b, if they have the same length and the 
same direction [as explained in Fig. 166; in particular, note (B)]. Hence a vector 
can be arbitrarily translated; that is, its initial point can be chosen arbitrarily. 


YA A yf & 


Equal vectors, Vectors having Vectors having Vectors having 
a=b the same length the same direction different length 
(A) but different but different and different 
direction length direction 
(B) (C) (D) 


Fig. 166. (A) Equal vectors. (B)-(D) Different vectors 


356 


EXAMPLE 1 


CHAP. 9 Vector Differential Calculus. Grad, Div, Curl 


Components of a Vector 


We choose an xyz Cartesian coordinate system! in space (Fig. 167), that is, a usual 
rectangular coordinate system with the same scale of measurement on the three mutually 
perpendicular coordinate axes. Let a be a given vector with initial point P: (x4, yi, z1) and 
terminal point Q: (x2, ye, Z2). Then the three coordinate differences 


(1) Gi AS Nn Ga a= i, Ba 2 <8 
are called the components of the vector a with respect to that coordinate system, and we 
write simply a = [d1, dg, a3]. See Fig. 168. 


The length |a| of a can now readily be expressed in terms of components because from 
(1) and the Pythagorean theorem we have 


(2) lal = Vaj + a3 + a. 


Components and Length of a Vector 


The vector a with initial point P: (4, 0, 2) and terminal point Q: (6, —1, 2) has the components 


a4, =6-4=2, ag, 1=0 1K a3 =2—-—2=0. 


Hence a = [2, —1, 0]. (Can you sketch a, as in Fig. 168?) Equation (2) gives the length 
lal = V2? + (-1)7 + 07 = V5. 


If we choose (—1, 5, 8) as the initial point of a, the corresponding terminal point is (1, 4, 8). 

If we choose the origin (0, 0, 0) as the initial point of a, the corresponding terminal point is (2, —1, 0); its 
coordinates equal the components of a. This suggests that we can determine each point in space by a vector, 
called the position vector of the point, as follows. | 


A Cartesian coordinate system being given, the position vector r of a point A: (x, y, z) 
is the vector with the origin (0, 0, 0) as the initial point and A as the terminal point (see 
Fig. 169). Thus in components, r = [x, y, z]. This can be seen directly from (1) with 
Xp =y1 = 21 = 0. 


Fig. 167. Cartesian Fig. 168. Components Fig. 169. Position vector r 
coordinate system of a vector of a point A: (x, y, z) 


1Named after the French philosopher and mathematician RENATUS CARTESIUS, latinized for RENE 
DESCARTES (1596-1650), who invented analytic geometry. His basic work Géométrie appeared in 1637, as 
an appendix to his Discours de la méthode. 
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Furthermore, if we translate a vector a, with initial point P and terminal point Q, then 
corresponding coordinates of P and Q change by the same amount, so that the differences 
in (1) remain unchanged. This proves 


THEOREM=—1 Vectors as Ordered Triples of Real Numbers 


A fixed Cartesian coordinate system being given, each vector is uniquely determined 
by its ordered triple of corresponding components. Conversely, to each ordered 
triple of real numbers (ay, dg, a3) there corresponds precisely one vector 
a = [d4, de, a3], with (0, 0, 0) corresponding to the zero vector 0, which has length 
0 and no direction. 

Hence a vector equation a = b is equivalent to the three equations a, = by, 
dg, = be, ag = bg for the components. 


We now see that from our “geometric” definition of a vector as an arrow we have arrived 
at an “algebraic” characterization of a vector by Theorem 1. We could have started from 
the latter and reversed our process. This shows that the two approaches are equivalent. 


Vector Addition, Scalar Multiplication 


Calculations with vectors are very useful and are almost as simple as the arithmetic for 
real numbers. Vector arithmetic follows almost naturally from applications. We first define 
how to add vectors and later on how to multiply a vector by a number. 


DEFINITION Addition of Vectors 


The sum a + b of two vectors a = [a, do, a3] and b = [bj, bo, bg] is obtained by 
adding the corresponding components, 


(3) at b= [ay al by, dg at bo, a3 =e bs]. 
Fig. 170. Vector Geometrically, place the vectors as in Fig. 170 (the initial point of b at the terminal 
addition point of a); then a + b is the vector drawn from the initial point of a to the terminal 
point of b. 


For forces, this addition is the parallelogram law by which we obtain the resultant of two 
forces in mechanics. See Fig. 171. 

Figure 172 shows (for the plane) that the “algebraic” way and the “geometric way” of 
vector addition give the same vector. 


Fig. 171. Resultant of two forces (parallelogram law) 


358 


DEFINITION 
z a, y 
a 2a -a - 


Fig. 175. Scalar 
multiplication 
[multiplication of 
vectors by scalars 
(numbers)] 


Nie 
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Basic Properties of Vector Addition. 


(4) 


(a) at+b=bta 

(b) (u+v)+w=u+(v+w) 
(c) a+0=0+a=a 
(d) a+ (—a) = 0. 


Familiar laws for real numbers give immediately 


(Commutativity) 


(Associativity) 


Properties (a) and (b) are verified geometrically in Figs. 173 and 174. Furthermore, —a 
denotes the vector having the length |a| and the direction opposite to that of a. 


i) 


b 


Fig. 172. Vector addition 


Fig. 173. 
of vector addition 


Cummutativity 


vw 


a / 


Fig. 174. Associativity 
of vector addition 


In (4b) we may simply write u + v + w, and similarly for sums of more than three 
vectors. Instead of a + a we also write 2a, and so on. This (and the notation —a used 
just before) motivates defining the second algebraic operation for vectors as follows. 


Scalar Multiplication (Multiplication by a Number) 


The product ca of any vector a = [a}, dg, ag] and any scalar c (real number c) is 
the vector obtained by multiplying each component of a by c, 


(5) ca = [caj, cdg, Ca3]. 


Geometrically, if a # 0, then ca with c > 0 has the direction of a and with c < 0 
the direction opposite to a. In any case, the length of ca is \ca| = Ic| lal ,andca = 0 


if a = 0 or c = 0 (or both). (See Fig. 175.) 


Ba 


(6) 


sic Properties of Scalar Multiplication. 


(a) ciat+b) =ca+tcb 
(b) (c + ka =ca+tka 
(c) c(ka) = (ck)a 
(d) la=a. 


From the definitions we obtain directly 


(written cka) 


SEC. 9.1 Vectors in 2-Space and 3-Space 359 


EXAMPLE 2 


EXAMPLE-3 


You may prove that (4) and (6) imply for any vector a 


(a) 0a = 0 
(7) 
(b) (—la=-—a. 


Instead of b + (—a) we simply write b — a (Fig. 176). 


Vector Addition. Multiplication by Scalars 


With respect to a given coordinate system, let 


a=[4,0,1] and b= [2,—-5,23]. 


Then —a = [—4,0,—1], 7a = [28,0,7], a+b = [6,—5,$], and 


Qa — b) = 2[2, 5, 3] = [4, 10, $] = 2a — 2b. a 


Unit Vectors i, j, k. Besides a = [ay, ag, a3] another popular way of writing vectors is 
(8) a ayi ote doj te agk. 


In this representation, i, j, k are the unit vectors in the positive directions of the axes of 
a Cartesian coordinate system (Fig. 177). Hence, in components, 


(9) i=[1,0,0)" j= i010), k>=10,0, 1 
and the right side of (8) is a sum of three vectors parallel to the three axes. 


ijk Notation for Vectors 


In Example 2 we have a = 4i + k, b = 2i — 5j + 3k, and so on. a] 


All the vectors a = [dy, dg, a3] = ayi + agj + agk (with real numbers as components) 
form the real vector space R® with the two algebraic operations of vector addition and 
scalar multiplication as just defined. R® has dimension 3. The triple of vectors i, j, k 
is called a standard basis of R®. Given a Cartesian coordinate system, the representation 
(8) of a given vector is unique. 

Vector space R? is a model of a general vector space, as discussed in Sec. 7.9, but is 
not needed in this chapter. 


Fig. 176. Difference of vectors Fig. 177. The unit vectors i, j, k 
and the representation (8) 
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1-5| COMPONENTS AND LENGTH 


Find the components of the vector v with initial point P 
and terminal point Q. Find |v]. Sketch |v|. Find the unit 
vector u in the direction of v. 


1. P:(1, 1,0), Q: (6,2, 0) 

2. P:(1,1,1), Q:(2,2,0) 

3. P: (—3.0, 4,0, —0.5),  Q: (5.5, 0, 1.2) 
4. P:(1,4,2), Q:(—-1, —4, -2) 

5. P:(0,0,0), Q: (2, 1, —2) 


Find the terminal point Q of the vector v with 
components as given and initial point P. Find |v]. 

6. 4,0,0; P: (0, 2, 13) 

Tegste=ay Pig; =3,4) 

8. 13.1,0.8, —2.0;  P: (0, 0, 0) 


9. 6,1, —4; P:(—6, —1, —4) 

10. 0, —3,3; P: (0,3, —3) 

11-18} ADDITION, SCALAR MULTIPLICATION 
Let a = [3, 2,0] = 31 + 2j; b = [—4, 6, 0] = 4i + 6j, 
ce = [5, -1, 8] = 5i-—j+ 8k, d= [0,0,4] = 4k. 
Find: 


11. 2a, da, —a 
12. (a+b)+ec, at+(b+c) 


13.b+¢, c+b 

14, 3c — 6d, 3(c — 2d) 
15. 7(¢ — b), 7e — 7b 
16. 2a — 3c, 9Ga— 30) 
17. (7 — 3)a, 7a — 3a 
18. da + 3b, —4a — 3b 


19. What laws do Probs. 12-16 illustrate? 
20. Prove Eqs. (4) and (6). 


21-25| FORCES, RESULTANT 


Find the resultant in terms of components and _ its 
magnitude. 
21. p = [2,3,0], q =[0,6, 1], u = [2,0, —4] 
22. p = [1, —2,3], q = [3, 21, —16], 

u = [—4, —19, 13] 
23. u = [8,-1,0], v= [5.0.3], 
24. p= [-1,2, -3], q=[1,1, 0), 
25. u = [3, 1, —6],  v = [0, 2,5], 


w=[-a.L471 
u = [1, —2, 2] 
w= [3,=1,-13] 
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PROBLEM SET 9-1 


26-37 


FORCES, VELOCITIES 


26. 


27. 


28. 


29. 


30. 


31. 


32. 


33. 


34. 


35. 


36. 


37. 


Equilibrium. Find v such that p, q, u in Prob. 21 and 
v are in equilibrium. 


Find p such that u, v, w in Prob. 23 and p are in 
equilibrium. 

Unit vector. Find the unit vector in the direction of 
the resultant in Prob. 24. 


Restricted resultant. Find all v such that the resultant 
of v, p, gq, u with p, q, u as in Prob. 21 is parallel to 
the xy-plane. 


Find v such that the resultant of p, q, u, v with p, 
q, u as in Prob. 24 has no components in x- and 
y-directions. 


For what k is the resultant of [2, 0, —7], [1, 2, —3], and 
[0, 3, k] parallel to the xy-plane? 


If |p| = 6 and |q| = 4, what can you say about the 
magnitude and direction of the resultant? Can you think 
of an application to robotics? 


Same question as in Prob. 32 if |p| = 9, |q| = 6, 
lu| = 3, 


Relative velocity. If airplanes A and B are moving 
southwest with speed |v,| = 550 mph, and north- 
west with speed |vg| = 450 mph, respectively, what 
is the relative velocity V = Vg — vy, of B with respect 
to A? 


Same question as in Prob. 34 for two ships moving 
northeast with speed |v4| = 22 knots and west with 
speed |vg| = 19 knots. 


Reflection. If a ray of light is reflected once in each 
of two mutually perpendicular mirrors, what can you 
say about the reflected ray? 


Force polygon. Truss. Find the forces in the system 
of two rods (truss) in the figure, where |p| = 1000 nt. 
Hint. Forces in equilibrium form a polygon, the force 
polygon. 


Jy 
= 
45° x 
p v p 
u 
Truss Force polygon 
Problem 37 
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38. TEAM PROJECT. Geometric Applications. To 


increase your skill in dealing with vectors, use vectors 
to prove the following (see the figures). 


(a) The diagonals of a parallelogram bisect each other. 


(b) The line through the midpoints of adjacent sides 
of a parallelogram bisects one of the diagonals in the 
ratio 1:3. 

(c) Obtain (b) from (a). 

(d) The three medians of a triangle (the segments 
from a vertex to the midpoint of the opposite side) 
meet at a single point, which divides the medians in 
the ratio 2: 1. 

(e) The quadrilateral whose vertices are the mid- 
points of the sides of an arbitrary quadrilateral is a 
parallelogram. 

(f) The four space diagonals of a parallelepiped meet 
and bisect each other. 

(g) The sum of the vectors drawn from the center of 
a regular polygon to its vertices is the zero vector. 


9.2 Inner Product (Dot Product) 


Orthogonality 


Team Project 38(a) 


BS 


Team Project 38(d) 


Team Project 38(e) 
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The inner product or dot product can be motivated by calculating work done by a constant 
force, determining components of forces, or other applications. It involves the length of 
vectors and the angle between them. The inner product is a kind of multiplication of two 
vectors, defined in such a way that the outcome is a scalar. Indeed, another term for inner 
product is scalar product, a term we shall not use here. The definition of the inner product 


is as follows. 


DEFINITION 


Inner Product (Dot Product) of Vectors 


The inner product or dot product a « b (read “a dot b’”) of two vectors a and b is 
the product of their lengths times the cosine of their angle (see Fig. 178), 


a*b = |al|b| cos y if a#+0,b4#0 
(1) 
aecb=0 if a=Oorb=0. 


The angle y,0 = y =77, between a and b is measured when the initial points of 
the vectors coincide, as in Fig. 178. In components, a = [a1, dg, a3], b = [by, be, bs], 


and 


(2) aeb= ayb, + dabo =F agb3. 
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THEOREM 1 


EXAMPLE 1 
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The second line in (1) is needed because y is undefined when a = 0 or b = 0. The 
derivation of (2) from (1) is shown below. 


Y 
a * Y 
yr a 
b b b 
asb>0 asb=0 ash <0 


(orthogonality) 


Fig. 178. Angle between vectors and value of inner product 


Orthogonality. Since the cosine in (1) may be positive, 0, or negative, so may be the 
inner product (Fig. 178). The case that the inner product is zero is of particular practical 
interest and suggests the following concept. 


A vector a is called orthogonal to a vector b if a* b = 0. Then b is also orthogonal 
to a, and we call a and b orthogonal vectors. Clearly, this happens for nonzero vectors 
if and only if cos y = 0; thus y = 77/2 (90°). This proves the important 


Orthogonality Criterion 


The inner product of two nonzero vectors is 0 if and only if these vectors are 
perpendicular. 


Length and Angle. Equation (1) with b = a gives ava = |a|”. Hence 
(3) lal = Vaea. 
From (3) and (1) we obtain for the angle y between two nonzero vectors 


aeb _ aeb 
lal|b] VaeaVbeb. 


(4) cos y = 


Inner Product. Angle Between Vectors 


Find the inner product and the lengths of a = [1, 2, 0] and b = [3, —2, 1] as well as the angle between these 
vectors. 


Solution. asb=1-:3+2-(-2)+0°1 1, lal = Vaca = V5, |b] = Vb-b = V14, and (4) 
gives the angle 


Yy = arccos = arccos (—0.11952) = 1.69061 = 96.865°. | 
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From the definition we see that the inner product has the following properties. For any 
vectors a, b, ¢ and scalars 41, ge, 


(a) = (qia + gob) ¢ = qiare + qibee (Linearity) 
(b) acb=bea (Symmetry) 
(5) 
acaZ0 
(c) (Positive-definiteness). 
aeca=0 ifandonlyif a=0 


Hence dot multiplication is commutative as shown by (5b). Furthermore, it is distributive 
with respect to vector addition. This follows from (5a) with qj = 1 and gg = |: 


(5a*) (a+ b)*c=aect+ bee (Distributivity). 
Furthermore, from (1) and |cos y| = 1 we see that 

(6) la*b| S |al|b| (Cauchy-Schwarz inequality). 

Using this and (3), you may prove (see Prob. 16) 

(7) la + b| = [al + |b| (Triangle inequality). 


Geometrically, (7) with < says that one side of a triangle must be shorter than the other 
two sides together; this motivates the name of (7). 
A simple direct calculation with inner products shows that 


(8) la + bl? + ja - bl? = 2(\al? + |b|?) (Parallelogram equality). 


Equations (6)-(8) play a basic role in so-called Hilbert spaces, which are abstract inner 
product spaces. Hilbert spaces form the basis of quantum mechanics, for details see 
[GenRef7] listed in App. 1. 


Derivation of (2) from (1). We write a = ayi + agj + agk and b = byi + boj + bsk, 
as in (8) of Sec. 9.1. If we substitute this into a * b and use (5a*), we first have a sum of 
3 X 3 = 9 products 


acb = abyieit ayboiej + --: + agb3k ek. 


Now i, j, k are unit vectors, so thatie i = j*j = k*k = 1 by (3). Since the coordinate 
axes are perpendicular, so are i, j, k, and Theorem | implies that the other six of those 
nine products are 0, namely, iej =jei=jek =kej =kei=i+ek =0. But this 
reduces our sum for a ¢ b to (2). @ 
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EXAMPLE 2 


EXAMPLE 3 
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Applications of Inner Products 


Typical applications of inner products are shown in the following examples and in 
Problem Set 9.2. 


Work Done by a Force Expressed as an Inner Product 


This is a major application. It concerns a body on which a constant force p acts. (For a variable force, see 
Sec. 10.1.) Let the body be given a displacement d. Then the work done by p in the displacement is defined as 


(9) W = |pl|id| cosa = ped, 


that is, magnitude |p| of the force times length |d| of the displacement times the cosine of the angle a between 
p and d (Fig. 179). If a < 90°, as in Fig. 179, then W > 0. If p and d are orthogonal, then the work is zero 
(why?). If a > 90°, then W < 0, which means that in the displacement one has to do work against the force. 
For example, think of swimming across a river at some angle a@ against the current. 


d 
Fig. 179. Work done by a force Fig. 180. Example 3 


Component of a Force in a Given Direction 


What force in the rope in Fig. 180 will hold a car of 5000 Ib in equilibrium if the ramp makes an angle of 25° 
with the horizontal? 


Solution. Introducing coordinates as shown, the weight is a = [0, —5000] because this force points 
downward, in the negative y-direction. We have to represent a as a sum (resultant) of two forces, a = ec + p, 
where c is the force the car exerts on the ramp, which is of no interest to us, and p is parallel to the rope. A 
vector in the direction of the rope is (see Fig. 180) 


b = [-1, tan 25°] = [—1, 0.46631], thus |b] = 1.10338, 


The direction of the unit vector u is opposite to the direction of the rope so that 
1 
u= “et = [0.90631, —0.42262]. 
b 


Since |u| = 1 and cos y > 0, we see that we can write our result as 


aeb 5000 - 0.46631 
|b| 1.10338 


Ip! = (lal cos y)|ul = acu 2113 [1b]. 


We can also note that y = 90° — 25° = 65° is the angle between a and p so that 
Ip| = lal cos y = 5000 cos 65° = 2113 [1b]. 


Answer: About 2100 Ib. |_| 
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EXAMPLE 4 


Example 3 is typical of applications that deal with the component or projection of a 
vector a in the direction of a vector b(#0). If we denote by p the length of the orthogonal 
projection of a on a straight line / parallel to b as shown in Fig. 181, then 


(10) p= lal cos y. 


Here p is taken with the plus sign if pb has the direction of b and with the minus sign if 
pb has the direction opposite to b. 


a 
a I y a Y 
I 
1 i | l 1 
b b 
ae 
p 


(p > 0) (p = 0) (p <0) 
Fig. 181. Component of a vector a in the direction of a vector b 
Multiplying (10) by |b| / |b| = 1, we have a b in the numerator and thus 


_acb 
|b| 


(11) Pp (b # 0). 


If b is a unit vector, as it is often used for fixing a direction, then (11) simply gives 
(12) p=arb ({b| = 1). 


Figure 182 shows the projection p of a in the direction of b (as in Fig. 181) and the 
projection g = |b| cos y of b in the direction of a. 


aa 
ay Se 


Fig. 182. Projections p of aon b and q of bona 


Orthonormal Basis 


By definition, an orthonormal basis for 3-space is a basis {a, b, c} consisting of orthogonal unit vectors. It has 
the great advantage that the determination of the coefficients in representations v = /,a + /gb + Isc of a given 
vector v is very simple. We claim that /} = a* v,/2 = b* v,/3 = ¢* v. Indeed, this follows simply by taking 
the inner products of the representation with a, b, c, respectively, and using the orthonormality of the basis, 
aev=laca+t loaeb + Igaece = 14, etc. 

For example, the unit vectors i, j, k in (8), Sec. 9.1, associated with a Cartesian coordinate system form an 
orthonormal basis, called the standard basis with respect to the given coordinate system. @ 
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EXAMPLE 5 


EXAMPLE 6 
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Orthogonal Straight Lines in the Plane 


Find the straight line L; through the point P: (1, 3) in the xy-plane and perpendicular to the straight line 
Lg:x — 2y + 2 = 0; see Fig. 183. 


Solution. The idea is to write a general straight line Ly: a,x + agy = casa*r =c witha = [ay, ao] #0 
and r = [x, y], according to (2). Now the line Lj through the origin and parallel to Ly is as r = 0. Hence, by 
Theorem 1, the vector a is perpendicular to r. Hence it is perpendicular to L} and also to Ly because L; and 
Lj are parallel. a is called a normal vector of L; (and of Lj). 

Now a normal vector of the given line x — 2y + 2 = 0 is b = [1, —2]. Thus Ly is perpendicular to Ly 
if b* a = ay — 2a = 0, for instance, if a = [2, 1]. Hence Ly is given by 2x + y = c. It passes through 
P:(1,3) when 2-1+3=c=5. Answer: y= —2x+ 5. Show that the point of intersection is 
(x, y) = (1.6, 1.8). = 


Normal Vector to a Plane 
Find a unit vector perpendicular to the plane 4x + 2y + 4z = —7. 


Solution. Using (2), we may write any plane in space as 
(13) aer=ayx + ay + agz=c 


where a = [ay, dg, a3] # 0 and r = [y, y, z]. The unit vector in the direction of a is (Fig. 184) 


1 
n=—a. 
lal 
Dividing by lal, we obtain from (13) 
c 
(14) ner=p where p= ial 
a 


From (12) we see that p is the projection of r in the direction of n. This projection has the same constant value 
c/|a| for the position vector r of any point in the plane. Clearly this holds if and only if n is perpendicular to 
the plane. n is called a unit normal vector of the plane (the other being —n). 

Furthermore, from this and the definition of projection, it follows that |p| is the distance of the plane from 
the origin. Representation (14) is called Hesse’s” normal form of a plane. In our case, a = [4, 2, 4], c = —7, 
lal =6,n= ga = &, 3, 21, and the plane has the distance é from the origin. o 


Fig. 183. Example 5 Fig. 184. Normal vector to a plane 


2LUDWIG OTTO HESSE (1811-1874), German mathematician who contributed to the theory of curves and 
surfaces. 
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1-10| INNER PRODUCT 


Let a = [1, -3,5], b= [4,0,8], ce = [-2,9, 1]. 
Find: 

l.aeb, bea, bee 

2. (-3a+ 5e)*b, 15(a-—c¢)*b 

3. lal, [2b], |—el 

4. |a+bl, lal + [bl 

5. |b+ cl, |b| + lel 

6. |a + el? + la — el? — 2(\al? + |e|?) 
7. |aecl, — falle| 

8. 5ae13b, 65a°b 

9, l5aeb+ l5aec, I5ae(b+c) 
10. ar: (b—c), (a-—b)ece 

11-16} GENERAL PROBLEMS 


11. What laws do Probs. | and 4—7 illustrate? 

12. What does u* v = u* w imply if u = 0? If u 4 0? 

13. Prove the Cauchy—Schwarz inequality. 

14. Verify the Cauchy—Schwarz and triangle inequalities 
for the above a and b. 

15. Prove the parallelogram equality. Explain its name. 

16. Triangle inequality. Prove Eq. (7). Hint. Use Eq. (3) 
for |a + b| and Eq. (6) to prove the square of Eq. (7), 
then take roots. 


17-20| WORK 


Find the work done by a force p acting on a body if the 
body is displaced along the straight segment AB from A to 
B. Sketch AB and p. Show the details. 


17. p = [2,5,0], A: (1,3,3),  B: (3,5, 5) 

18. p = [—1, —2, 4], A:(0,0,0), B: (6,7, 5) 
19. p = [0,4,3], A:(4,5,-1), B: (1,3, 0) 
20. p = [6, —3, -3], A:(1,5,2), B: (3,4, 1) 


21. Resultant. Is the work done by the resultant of two 
forces in a displacement the sum of the work done 
by each of the forces separately? Give proof or 
counterexample. 


22-30 | ANGLE BETWEEN VECTORS 


Let a = [1, 1, 0], b = [3, 2, 1], ande = [1, 0, 2]. Find the 
angle between: 


22. a, b 
23. b, ¢ 
24,.a+c, bre 
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PROBLEM SET 9-2 


25. What will happen to the angle in Prob. 24 if we replace 
c by ne with larger and larger n? 

26. Cosine law. Deduce the law of cosines by using 
vectors a, b, and a — b. 


27. Addition law. cos (a — B) = cosacosB + sina 
sin 8. Obtain this by using a = [cosa, sina], 
b = [cos B, sin B] where0O Sa SB S27. 


28. Triangle. Find the angles of the triangle with vertices 
A: (0, 0, 2), B: (3, 0,2), and C:(1, 1,1). Sketch the 
triangle. 


29. Parallelogram. Find the angles if the vertices are 
(0, 0), (6, 0), (8, 3), and (2, 3). 


30. Distance. Find the distance of the point A: (1, 0, 2) 
from the plane P: 3x + y + z = 9. Make a sketch. 


31-35 | ORTHOGONALITY is particularly important, 
mainly because of orthogonal coordinates, such as Cartesian 


coordinates, whose natural basis [Eq. (9), Sec. 9.1], consists 
of three orthogonal unit vectors. 


31. For what values of a, are [ay, 4,3] and [3, —2, 12] 
orthogonal? 


32. Planes. For what c are 3x + z=5 and 8x -—y+ 
cz = 9 orthogonal? 


33. Unit vectors. Find all unit vectors a = [ay, dg] in the 
plane orthogonal to [4, 3]. 


34. Corner reflector. Find the angle between a light ray 
and its reflection in three orthogonal plane mirrors, 
known as corner reflector. 


35. Parallelogram. When will the diagonals be ortho- 
gonal? Give a proof. 


COMPONENT IN THE DIRECTION 
OF A VECTOR 


Find the component of a in the direction of b. Make a 
sketch. 


36-40 


36. a= [1,1,1], b= [2, 1,3] 
37. a = [3,4, 0], b = [4, —3, 2] 
38. a = [8, 2,0], b= [-4, -1,0] 


39. When will the component (the projection) of a in the 
direction of b be equal to the component (the 
projection) of b in the direction of a? First guess. 


40. What happens to the component of a in the direction 
of b if you change the length of b? 
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9.3 Vector Product (Cross Product) 


DEFINITION 


We shall define another form of multiplication of vectors, inspired by applications, whose 
result will be a vector. This is in contrast to the dot product of Sec. 9.2 where multiplication 
resulted in a scalar. We can construct a vector v that is perpendicular to two vectors a 
and b, which are two sides of a parallelogram on a plane in space as indicated in Fig. 185, 
such that the length |v| is numerically equal to the area of that parallelogram. Here then 
is the new concept. 


Vector Product (Cross Product, Outer Product) of Vectors 


The vector product or cross product a X b (read “a cross b”) of two vectors a 
and b is the vector v denoted by 


=aXb 


I. If a = 0 orb = O, then we define v = a X b = 0. 
II. If both vectors are nonzero vectors, then vector v has the length 


(1) lv] = |a x bl = [al|b] sin y, 


where y is the angle between a and b as in Sec. 9.2. 


Furthermore, by design, a and b form the sides of a parallelogram on a plane 
in space. The parallelogram is shaded in blue in Fig. 185. The area of this blue 
parallelogram is precisely given by Eq. (1), so that the length |v| of the vector 
v is equal to the area of that parallelogram. 

Ill. If a and b lie in the same straight line, i.e., a and b have the same or opposite 
directions, then y is 0° or 180° so that sin y = 0. In that case lv| = 0 so that 
v=axb=0. 

IV. If cases I and III do not occur, then v is a nonzero vector. The direction of 
v = aX bis perpendicular to both a and b such that a, b, v—precisely in this 
order (!)—form a right-handed triple as shown in Figs. 185-187 and explained 
below. 


Another term for vector product is outer product. 


Remark. Note that I and HI completely characterize the exceptional case when the cross 
product is equal to the zero vector, and II and IV the regular case where the cross product 
is perpendicular to two vectors. 

Just as we did with the dot product, we would also like to express the cross product in 
components. Let a = [a}, da, a3] and b = [by, be, bs]. Then v = [v1, v2, v3] = a X bhas 
the components 


(2) V1 = dgb3 — azbo, Vg = agby — aybs, V3 = aybz — agby. 


Here the Cartesian coordinate system is right-handed, as explained below (see also 
Fig. 188). (For a left-handed system, each component of v must be multiplied by —1. 
Derivation of (2) in App. 4.) 
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Right-Handed Triple. A triple of vectors a, b, v is right-handed if the vectors in the 
given order assume the same sort of orientation as the thumb, index finger, and middle 
finger of the right hand when these are held as in Fig. 186. We may also say that if a is 
rotated into the direction of b through the angle y (<7r), then v advances in the same 
direction as a right-handed screw would if turned in the same way (Fig. 187). 


sae 
a 
Fig. 185. Vector product Fig. 186. Right-handed Fig. 187. Right-handed 
triple of vectors a, b, v screw 


Right-Handed Cartesian Coordinate System. The system is called right-handed if 
the corresponding unit vectors i, j, k in the positive directions of the axes (see Sec. 9.1) 
form a right-handed triple as in Fig. 188a. The system is called left-handed if the sense 
of k is reversed, as in Fig. 188b. In applications, we prefer right-handed systems. 


(a) Right-handed (b) Left-handed 


Fig. 188. The two types of Cartesian coordinate systems 


How to Memorize (2). If you know second- and third-order determinants, you see that 
(2) can be written 


Q*) u= 
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EXAMPLE 1 


EXAMPLE 2 


THEOREM -1 


axb 


I 
I 
I 
bxay 


Fig. 189. 
Anticommutativity 
of cross 
multiplication 


CHAP. 9 Vector Differential Calculus. Grad, Div, Curl 


and v = [U4, Vg, v3] = Uzi + Vej + U3k is the expansion of the following symbolic 
determinant by its first row. (We call the determinant “symbolic” because the first row 
consists of vectors rather than of numbers.) 


i j«k 
a2 a3 a 43 a, a2 
(2**) v=aXb=l/a, dag a3)= i- j+ k. 
by bg by bg by be 
by by bg 


For a left-handed system the determinant has a minus sign in front. 


Vector Product 
For the vector product v = a X b of a = [1, 1, 0] and b = [3, 0, 0] in right-handed coordinates we obtain 
from (2) 

Vv; = 0, Vg = 0, v3 =1-0-1:°:3=—-3. 


We confirm this by (2**): 


1 1 


i k = —3k = [0, 0, —3]. 


0 0 3.0 


3 0 O 


To check the result in this simple case, sketch a, b, and v. Can you see that two vectors in the xy-plane must 
always have their vector product parallel to the z-axis (or equal to the zero vector)? | 


Vector Products of the Standard Basis Vectors 


(3) 


We shall use this in the next proof. | 


General Properties of Vector Products 


(a) For every scalar l, 
(4) (la) X b = (a X b) = a X (ib). 
(b) Cross multiplication is distributive with respect to vector addition; that is, 


(a) aX (b+ c)=(aX b)+ (aX o), 


(5) 
(B) (a+b) Xc=(aXce)+ (bX c). 


(c) Cross multiplication is not commutative but anticommutative; that is, 


(6) b X a= —(a X b) (Fig. 189). 
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PROOF 


EXAMPLE 3 


(d) Cross multiplication is not associative; that is, in general, 
(7) ax(bxc)#(axb)xXe 


so that the parentheses cannot be omitted. 


Equation (4) follows directly from the definition. In (5a), formula (2*) gives for the first 
component on the left 


ag, a3 
= do(b3 + cz) — ag(bz + ce) 


bg + co bz + cg 


(dgb3 — agbz) + (dgc3 — agce) 


42 «a3 


bz bg C2 C3 


By (2*) the sum of the two determinants is the first component of (a X b) + (a X c), the 
right side of (Sa). For the other components in (5a) and in 5(), equality follows by the 
same idea. 

Anticommutativity (6) follows from (2**) by noting that the interchange of Rows 2 
and 3 multiplies the determinant by —1. We can confirm this geometrically if we set 


a X b = vandb X a = w; then lv| = |w| by (1), and for b, a, w to form a right-handed 
triple, we must have w = —v. 

Finally, i x (i X j) =i X k = —j, whereas (i X i) X j = 0 X j = 0 (see Example 2). 
This proves (7). a] 


Typical Applications of Vector Products 


Moment of a Force 


In mechanics the moment m of a force p about a point Q is defined as the product m = |p|d, where d is the 
(perpendicular) distance between Q and the line of action L of p (Fig. 190). If r is the vector from Q to any 
point A on L, then d = |r| sin y, as shown in Fig. 190, and 


m = |r||p| sin y. 
Since y is the angle between r and p, we see from (1) that m = |r X p|. The vector 


(8) m=rxp 


is called the moment vector or vector moment of p about Q. Its magnitude is m. If m # 0, its direction is 
that of the axis of the rotation about Q that p has the tendency to produce. This axis is perpendicular to both 
rand p. | 


Fig. 190. Moment of a force p 
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EXAMPLE 4 


EXAMPLE 5 
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Moment of a Force 
Find the moment of the force p about the center Q of a wheel, as given in Fig. 191. 


Solution. Introducing coordinates as shown in Fig. 191, we have 


p = [1000 cos 30°, 1000 sin 30°, 0] = [866, 500, OJ, r=[0, 1.5, Oj. 


(Note that the center of the wheel is at y = —1.5 on the y-axis.) Hence (8) and (2**) give 
i j k 
0 1.5 
m=rxp=| 0 1.5 O| = Oi — Oj + k = [0, 0, —1299]. 
866 500 


866 500 0 


This moment vector m is normal, i.e., perpendicular to the plane of the wheel. Hence it has the direction of the 
axis of rotation about the center Q of the wheel that the force p has the tendency to produce. The moment m 
points in the negative z-direction, This is, the direction in which a right-handed screw would advance if turned 
in that way. lea] 


|p| = 1000 Ib 


Fig. 191. Moment of a force p 


Velocity of a Rotating Body 


A rotation of a rigid body B in space can be simply and uniquely described by a vector w as follows. The 
direction of w is that of the axis of rotation and such that the rotation appears clockwise if one looks from the 
initial point of w to its terminal point. The length of w is equal to the angular speed w(>0) of the rotation, 
that is, the linear (or tangential) speed of a point of B divided by its distance from the axis of rotation. 

Let P be any point of B and d its distance from the axis. Then P has the speed wd. Let r be the position vector 
of P referred to a coordinate system with origin 0 on the axis of rotation. Then d = |r| sin y, where y is the 
angle between w and r. Therefore, 


od = |w\|r| sin y = |w x rl. 


From this and the definition of vector product we see that the velocity vector v of P can be represented in the 
form (Fig. 192) 


(9) Ve WE: 


This simple formula is useful for determining v at any point of B. B 


Fig. 192. Rotation of a rigid body 
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Scalar Triple Product 


Certain products of vectors, having three or more factors, occur in applications. The most 
important of these products is the scalar triple product or mixed product of three vectors 
a, b, ¢. 


(10*) (a b c)=ar(bX oc). 


The scalar triple product is indeed a scalar since (10*) involves a dot product, which in 
turn is a scalar. We want to express the scalar triple product in components and as a third- 
order determinant. To this end, let a = [a, dg, ag], b = [b1, be, bg], and ¢ = [cy, ce, C3]. 
Also set b X ¢ = Vv = [U4, Va, U3]. Then from the dot product in components [formula 
(2) in Sec. 9.2] and from (2*) with b and c instead of a and b we first obtain 


ae(b X c) =aev = ayvy + agqvg + agl3 


The sum on the right is the expansion of a third-order determinant by its first row. Thus 
we obtain the desired formula for the scalar triple product, that is, 


a4, a2 43 
(10) (a b c)=ar(bxXec)=|h, do bol. 


C1 C2 CB 


The most important properties of the scalar triple product are as follows. 


THEOREM 2 Properties and Applications of Scalar Triple Products 


(a) In (10) the dot and cross can be interchanged: 
(11) (a b c)=ar(bxXc)=(axb)ec. 


(b) Geometric interpretation. The absolute value |(a b  c)| of (10) is the 
volume of the parallelepiped (oblique box) with a, b, ¢ as edge vectors (Fig. 193). 

(c) Linear independence. Three vectors in R® are linearly independent if 
and only if their scalar triple product is not zero. 


PROOF (a) Dot multiplication is commutative, so that by (10) 
Cy C2 C3 
(aX b)ec=cr(axX b)=|a, a asl. 


by bg bg 
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EXAMPLE 6 


Fig. 194. 
Tetrahedron 
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From this we obtain the determinant in (10) by interchanging Rows | and 2 and in the 
result Rows 2 and 3. But this does not change the value of the determinant because each 
interchange produces a factor —1, and (—1)(—1) = 1. This proves (11). 

(b) The volume of that box equals the height = lal|cos y| (Fig. 193) times the area 
of the base, which is the area |b X c| of the parallelogram with sides b and ec. Hence the 
volume is 


|al|b x e| |cos y| = |a* (b X o)| (Fig. 193) 


as given by the absolute value of (11). 

(c) Three nonzero vectors, whose initial points coincide, are linearly independent if and 
only if the vectors do not lie in the same plane nor lie on the same straight line. 

This happens if and only if the triple product in (b) is not zero, so that the independence 
criterion follows. (The case of one of the vectors being the zero vector is trivial.) 


bxe 


Fig. 193. Geometric interpretation of a scalar triple product 


Tetrahedron 


A tetrahedron is determined by three edge vectors a, b, ¢, as indicated in Fig. 194. Find the volume of the tetrahedron 
in Fig. 194, when a = [2, 0, 3], b = [0, 4, 1], c = [5, 6, O]. 


Solution. The volume V of the parallelepiped with these vectors as edge vectors is the absolute value of the 
scalar triple product 


4 1 0 4 


+ 3 


6 0 5 6 


5 6 0 


Hence V = 72. The minus sign indicates that if the coordinates are right-handed, the triple a, b, ¢ is left-handed. 
The volume of a tetrahedron is 4 of that of the parallelepiped (can you prove it?), hence 12. 

Can you sketch the tetrahedron, choosing the origin as the common initial point of the vectors? What are the 
coordinates of the four vertices? @ 


This is the end of vector algebra (in space R® and in the plane). Vector calculus 
(differentiation) begins in the next section. 
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1-10| GENERAL PROBLEMS 4. Lagrange’s identity for |a x b|. Verify it for 


1. Give the details of the proofs of Eqs. (4) and (5). a = [3,4,2] and b =[I1,0,2]. Prove it, using 


2. What does a X b = a X c with a # 0 imply? 


sin” y=l1- cos” y. The identity is 


3. Give the details of the proofs of Eqs. (6) and (11). (12) la X b| = Via *a)(beb) — (ae b). 
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5. 


6. 


10. 


What happens in Example 3 of the text if you replace 
p by —p? 

What happens in Example 5 if you choose a P at 
distance 2d from the axis of rotation? 


. Rotation. A wheel is rotating about the y-axis with 


angular speed w = 20 sec +. The rotation appears 
clockwise if one looks from the origin in the positive 
y-direction. Find the velocity and speed at the point 
[8, 6, 0]. Make a sketch. 


. Rotation. What are the velocity and speed in Prob. 7 


at the point (4, 2, —2) if the wheel rotates about the 
line y = x,z = 0 with w = 10 sec 1? 


. Scalar triple product. What does (a b c) = 0 imply 


with respect to these vectors? 

WRITING REPORT. Summarize the most important 
applications discussed in this section. Give examples. 
No proofs. 


11-23 


VECTOR AND SCALAR 


TRIPLE PRODUCTS 


With respect to right-handed Cartesian coordinates, 


let 


a= [2,1,0], b= [—3,2,0], ¢=[1,4,—2], and 


d = [5, —1, 3]. Showing details, find: 


11. 
12. 
13. 
14. 
15. 
16. 
17. 
18. 
19. 
20. 
21. 
22. 
23. 
24. 


axb, bxXa, aeb 
3c X 5d, 15d Xe, 15d°c, 
ecxX(at+tb), axct+bxXe 
4b X 3c + 12c Xb 

(a + d) X (d+ a) 

(bX c)*ed, be(ce X d) 
(bX c)xd, bx (ec Xd) 
(aX b) Xa, aX (b X a) 
Gjk, @kj 

(aX b) X (ec Xd), (a b djc — (a b cd 
4b X 3c, 12|bx cl, 12|/ce x bl 
(a-—be-bd-—pb), (acd) 

bxb, (b-—c)X(c—b), beb 

TEAM PROJECT. Useful Formulas for Three and 


Four Vectors. Prove (13)-(16), which are often useful 
in practical work, and illustrate each formula with two 


I5eed 
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examples. Hint. For (13) choose Cartesian coordinates 
such that d = [d,, 0, 0] and ¢ = [c}, cy, 0]. Show that 
each side of (13) then equals [—baced1, bycod}, 0], and 
give reasons why the two sides are then equal in any 
Cartesian coordinate system. For (14) and (15) use (13). 


(13) b X (ce X d) = (be d)e — (be e)d 

(14) (a X b) X (ec X d) = (a b dec — (a b o)d 
(15) (a X b)*(e X d) = (as e\(bed) — (ae d\(bec) 
(16) (a b c) = (bc a) = (ca b) 


= —(c b a) = —(ac b) 


25-35 


APPLICATIONS 


25. 


26. 


27. 


28. 


29. 


30. 


31. 


32. 


33. 


34. 


35. 


Moment m of a force p. Find the moment vector m 
and m of p = [2, 3, 0] about Q: (2, 1, 0) acting on a 
line through A: (0, 3, 0). Make a sketch. 
Moment. Solve Prob. 25 if p = [1, 0, 3], 
and A: (4, 3, 5). 

Parallelogram. Find the area if the vertices are (4, 2, 
0), (10, 4, 0), (5, 4, 0), and (11, 6, 0). Make a sketch. 
A remarkable parallelogram. Find the area of the 
quadrangle Q whose vertices are the midpoints of 
the sides of the quadrangle P with vertices A: (2, 1, 0), 
B: (5, —1.0), C: (8, 2, 0), and D: (4, 3, 0). Verify that 
Q is a parallelogram. 


Q: (2, 0, 3), 


Triangle. Find the area if the vertices are (0, 0, 1), 
(2, 0, 5), and (2, 3, 4). 

Plane. Find the plane through the points A: (1, 2, 4), 
B: (4, 2, —2), and C: (0, 8, 4). 

Plane. Find the plane through (1, 3, 4), (1, —2, 6), and 
(4, 0, 7). 

Parallelepiped. Find the volume if the edge vectors 
are i + j, —2i + 2k, and —2i — 3k. Make a sketch. 
Tetrahedron. Find the volume if the vertices are 
(, 1, 1), (5, —7, 3), (7, 4, 8), and (10, 7, 4). 
Tetrahedron. Find the volume if the vertices are 
(1, 3, 6), (3, 7, 12), (8, 8, 9), and (2, 2, 8). 
WRITING PROJECT. Applications of Cross 
Products. Summarize the most important applications 
we have discussed in this section and give a few simple 
examples. No proofs. 


9.4 Vector and Scalar Functions and Their Fields. 
Vector Calculus: Derivatives 


Our discussion of vector calculus begins with identifying the two types of functions on which 
it operates. Let P be any point in a domain of definition. Typical domains in applications 
are three-dimensional, or a surface or a curve in space. Then we define a vector function 


v, whose values are vectors, that is, 


v = vV(P) = [v4(P), Va(P), v3(P)] 
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that depends on points P in space. We say that a vector function defines a vector field in 
a domain of definition. Typical domains were just mentioned. Examples of vector fields 
are the field of tangent vectors of a curve (shown in Fig. 195), normal vectors of a surface 
(Fig. 196), and velocity field of a rotating body (Fig. 197). Note that vector functions may 
also depend on time ¢ or on some other parameters. 

Similarly, we define a scalar function f, whose values are scalars, that is, 


f= f(P) 


that depends on P. We say that a scalar function defines a scalar field in that three- 
dimensional domain or surface or curve in space. Two representative examples of scalar 
fields are the temperature field of a body and the pressure field of the air in Earth’s 
atmosphere. Note that scalar functions may also depend on some parameter such as 
time f. 


Notation. If we introduce Cartesian coordinates x, y, z, then, instead of writing v(P) for 
the vector function, we can write 


v(x, y, Z) = [Ui y, z), Vax, y,Z), Ug, y, Z)]. 


Ji 
Ss 


Fig. 195. Field of tangent Fig. 196. Field of normal 
vectors of a curve vectors of a surface 


We have to keep in mind that the components depend on our choice of coordinate system, 
whereas a vector field that has a physical or geometric meaning should have magnitude 
and direction depending only on P, not on the choice of coordinate system. 

Similarly, for a scalar function, we write 


S(P) = f(x y, 2). 


We illustrate our discussion of vector functions, scalar functions, vector fields, and scalar 
fields by the following three examples. 


Scalar Function (Euclidean Distance in Space) 


The distance f(P) of any point P from a fixed point Po in space is a scalar function whose domain of definition 
is the whole space. f(P) defines a scalar field in space. If we introduce a Cartesian coordinate system and P) 
has the coordinates x9, yo, Zo, then fis given by the well-known formula 


f(P) = fix, y, 2) = Vx = x0)? +. — yo? + & = Zo)” 


where x, y, z are the coordinates of P. If we replace the given Cartesian coordinate system with another such 
system by translating and rotating the given system, then the values of the coordinates of P and Pp will in general 
change, but f(P) will have the same value as before. Hence f(P) is a scalar function. The direction cosines of 
the straight line through P and Pp are not scalars because their values depend on the choice of the coordinate 
system. | 
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EXAMPLE 2 


EXAMPLE 3 


Vector Field (Velocity Field) 


At any instant the velocity vectors v(P) of a rotating body B constitute a vector field, called the velocity field 
of the rotation. If we introduce a Cartesian coordinate system having the origin on the axis of rotation, then (see 
Example 5 in Sec. 9.3) 


(1) V(x, y,zZ) = WX r= wx [x,y,z] = w X (i + yj + zk) 


where x, y, z are the coordinates of any point P of B at the instant under consideration. If the coordinates are 
such that the z-axis is the axis of rotation and w points in the positive z-direction, then w = wk and 


ij k 


v=|0 0 | = o[—y, x, 0] = w(—yi + xj). 


x y Zz 


An example of a rotating body and the corresponding velocity field are shown in Fig. 197. 83) 


nm 


— 


<> 


! 
Fig. 197. Velocity field of a rotating body 


Vector Field (Field of Force, Gravitational Field) 


Let a particle A of mass M be fixed at a point Pp and let a particle B of mass m be free to take up various positions 
P in space. Then A attracts B. According to Newton’s law of gravitation the corresponding gravitational force p 
is directed from P to Po, and its magnitude is proportional to 1/r 2. where r is the distance between P and Py, say, 


c 
(2) lpl=—, c = GMm. 
2 


Here G = 6.67: 10-8 cm3/ (g- sec”) is the gravitational constant. Hence p defines a vector field in space. If 
we introduce Cartesian coordinates such that Pp has the coordinates x9, yo, Zo and P has the coordinates x, y, z, 
then by the Pythagorean theorem, 


r= V(x — x0)” + (y= yo" + = Zo” (=0). 
Assuming that r > 0 and introducing the vector 


r= [x-—xo, y- yo. z- Zol = @— Xo)i + (y — Yo)j + & — Zodk, 


we have |r| = r, and (—1/r)r is a unit vector in the direction of p; the minus sign indicates that p is directed 
from P to Pp (Fig. 198). From this and (2) we obtain 


Ip| 1 c X= Xo y— Yo Z= £0 
p Pp rea Fl c inet c a3 c 3 


(3) 


This vector function describes the gravitational force acting on B. isi] 
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: 


Fig. 198. Gravitational field in Example 3 


Vector Calculus 


The student may be pleased to learn that many of the concepts covered in (regular) 
calculus carry over to vector calculus. Indeed, we show how the basic concepts of 
convergence, continuity, and differentiability from calculus can be defined for vector 
functions in a simple and natural way. Most important of these is the derivative of a 
vector function. 


Convergence. An infinite sequence of vectors aj), = 1, 2,-:-, is said to converge if 
there is a vector a such that 


(4) lim |ag,) — al = 0. 
Nn 
a is called the limit vector of that sequence, and we write 
(5) lim Am) = a. 
nn 

If the vectors are given in Cartesian coordinates, then this sequence of vectors converges 
to a if and only if the three sequences of components of the vectors converge to the 
corresponding components of a. We leave the simple proof to the student. 

Similarly, a vector function w(t) of a real variable ¢ is said to have the limit / as t 
approaches fo, if v(t) is defined in some neighborhood of tg (possibly except at fg) and 
(6) lim |v(t) — I| = 0. 

tt 
Then we write 
(7) lim v(t) = 1. 
tto 


Here, a neighborhood of to is an interval (segment) on the f-axis containing fo as an interior 
point (not as an endpoint). 


Continuity. A vector function v(f) is said to be continuous at ft = fo if it is defined in 
some neighborhood of fo (including at fo itself!) and 


(8) fim v(t) = ¥(o). 
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DEFINITION 


If we introduce a Cartesian coordinate system, we may write 
V(t) = [vi (1), V2(), Vs()] = vii + vao(Oj + v3(Ok. 


Then v(f) is continuous at fo if and only if its three components are continuous at fo. 
We now state the most important of these definitions. 


Derivative of a Vector Function 


A vector function v(f) is said to be differentiable at a point ¢ if the following limit 
exists: 


; _ vt + At) — v(t) 
(9) v(t) = dim, we ; 


This vector v’(f) is called the derivative of v(t). See Fig. 199. 


Fig. 199. Derivative of a vector function 


In components with respect to a given Cartesian coordinate system, 
, if i , 
(10) V(@) =[v1, ve, v3]. 


Hence the derivative v' (t) is obtained by differentiating each component separately. For 
instance, if v = [t, t?, O], then v’ = [1, 2r, O]. 

Equation (10) follows from (9) and conversely because (9) is a “vector form” of the 
usual formula of calculus by which the derivative of a function of a single variable is 
defined. [The curve in Fig. 199 is the locus of the terminal points representing v(t) for 
values of the independent variable in some interval containing t and ¢ + Af in (9)]. It 
follows that the familiar differentiation rules continue to hold for differentiating vector 
functions, for instance, 


(cv)’ = cv (c constant), 
(uty) =u'+v 
and in particular 
(11) (uev) =u evtuev 
(12) (ux v)) =u’ xvtuxy’ 


(13) qu v wy) =(@ v wt v wt v w’. 
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The simple proofs are left to the student. In (12), note the order of the vectors carefully 
because cross multiplication is not commutative. 


EXAMPLE 4 __ Derivative of a Vector Function of Constant Length 
Let v(t) be a vector function whose length is constant, say, |v(¢)| = c. Then lv|? =vev=c*, and 
(vev)’ = 2vev’ = 0, by differentiation [see (11)]. This yields the following result. The derivative of a vector 
function v(t) of constant length is either the zero vector or is perpendicular to v(t). || 
Partial Derivatives of a Vector Function 
Our present discussion shows that partial differentiation of vector functions of two or more 
variables can be introduced as follows. Suppose that the components of a vector function 
v= [v1, V2, U3] = vi + Vj + v3k 
are differentiable functions of n variables ty, ---, t,. Then the partial derivative of v with 
respect to t,, is denoted by dv/dt,, and is defined as the vector function 
ov 0Ui. , OU OU 
Ay y 254 38 
Otm Itm Otm Otm 
Similarly, second partial derivatives are 
av a*v1 ‘ ava v3 
At{Itm Itty, Ott, Ot Otm — 
and so on. 
EXAMPLE 5 Partial Derivatives 
‘ 2 or . . P or 
Let r(ty, fg) = acostyi+ asintyj + tg2k. Then —— = —asintyi+acostyj and —— =k. | 


Ory 


Oto - 


Various physical and geometric applications of derivatives of vector functions will be 
discussed in the next sections as well as in Chap. 10. 


PROBLEM SET 9-4 


SCALAR FIELDS IN THE PLANE 


Let the temperature T in a body be independent of z so that 
it is given by a scalar function T = T(x, 1). Identify the 
isotherms 7(x, y) = const. Sketch some of them. 


(a) x? — Ax — y? 
(c) cos x sinh y 

(e) e*siny 
(haa oy 


(b) x?y — y?/3 
(d) sin x sinh y 
(f) e2* cos 2y 

(h) x? — 2x — y? 


9-14 


SCALAR FIELDS IN SPACE 


t= 77-y¥* 2. T = xy 

3. T = 3x — 4y 4, T = arctan (y/x) 

5. T = y/(x? + y”) 6. T = x/(x? + y?) 

7. T = 9x + 4y? 

8. CAS PROJECT. Scalar Fields in the Plane. Sketch 


or graph isotherms of the following fields and describe 
what they look like. 


What kind of surfaces are the level surfaces f(x, y, z) = 


const? 

9, f = 4x — By + 22 
11. f = 5x? + 2y? 
13. f= z— (x7 + y*) 


10. f=907 + y?) 4+ 22 
12. f=z- Vix2+y? 


14. f=x-y? 
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15-20 


VECTOR FIELDS 


Sketch figures similar to Fig. 198. Try to interpet the field 
of v as a velocity field. 


15. 
17. 
19. 
21. 


v=it+j 16. v = —yi + xj 
v= xj 18. v= xi + yj 
v=xi— yj 20. v = yi — xj 


CAS PROJECT. Vector Fields. Plot by arrows: 
(a) v = [x, x" ] (b) v = [1/y, 1/x] 
(c) v = [cos x, sin x] (d) v= ety?) [x, —y] 
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22-25 | DIFFERENTIATION 


22. 


23. 


24. 


25. 


Find the first and second derivatives of r = [3 cos 2r, 
3 sin 2t, 4f]. 


Prove (11)-(13). Give two typical examples for each 
formula. 


Find the first partial derivatives of v, = [e” cos y, 
e” sin y] and vz = [cos x cosh y, —sin x sinh y]. 


WRITING PROJECT. Differentiation of Vector 
Functions. Summarize the essential ideas and facts 
and give examples of your own. 


9.5 Curves. Arc Length. Curvature. Torsion 


Vector calculus has important applications to curves (Sec. 9.5) and surfaces (to be covered 
in Sec. 10.5) in physics and geometry. The application of vector calculus to geometry is 
a field known as differential geometry. Differential geometric methods are applied 
to problems in mechanics, computer-aided as well as traditional engineering design, 
geodesy, geography, space travel, and relativity theory. For details, see [GenRef8] and 


[GenRef9] in App. 1. 


Bodies that move in space form paths that may be represented by curves C. This and 
other applications show the need for parametric representations of C with parameter 1, 
which may denote time or something else (see Fig. 200). A typical parametric representation 


is given by 


(1) r(t) = [x(Q), 


Fig. 200. 


yO), 


z(@] = x@i + yOj + zOk. 


Parametric representation of a curve 


Here ¢ is the parameter and x, y, z are Cartesian coordinates, that is, the usual rectangular 
coordinates as shown in Sec. 9.1. To each value tf = fo, there corresponds a point of C 
with position vector r(tg) whose coordinates are x(to), y(to), Z(to). This is illustrated 


in Figs. 201 and 202. 


The use of parametric representations has key advantages over other representations 
that involve projections into the xy-plane and xz-plane or involve a pair of equations with 
y or with z as independent variable. The projections look like this: 


(2) 


y =f), 


Z= g(a). 


382 


EXAMPLE 1 


EXAMPLE 2 


EXAMPLE 3 
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The advantages of using (1) instead of (2) are that, in (1), the coordinates x, y, z all 
play an equal role, that is, all three coordinates are dependent variables. Moreover, the 
parametric representation (1) induces an orientation on C. This means that as we 
increase t, we travel along the curve C in a certain direction. The sense of increasing 
t is called the positive sense on C. The sense of decreasing ¢ is then called the negative 
sense on C, given by (1). 

Examples 1-4 give parametric representations of several important curves. 


Circle. Parametric Representation. Positive Sense 


The circle x? + y? = 4, z = 0 in the xy-plane with center 0 and radius 2 can be represented parametrically by 
r(t) = [2 cost, 2 sin t, 0] or simply by r(t) = [2 cost, 2 sin ¢] (Fig. 201) 


where 0 St = 277. Indeed, x? + y? = (2.cos 1)? + (2 sin 1)? 4(cos* t + sin? 4) = 4, For t = 0 we have 
r(O) = [2, 0], for t = aq we get ri a) = [0, 2], and so on. The positive sense induced by this representation 
is the counterclockwise sense. 

If we replace t with f* = —t, we have t = —f* and get 


r*(t*) = [2 cos (—f*), 2 sin (—?*)] = [2 cos f*, —2 sin f*]. 


This has reversed the orientation, and the circle is now oriented clockwise. Bo 


Ellipse 


The vector function 
(3) r(t) = [acost, bsint, 0] =acosti+ bsintj (Fig. 202) 


represents an ellipse in the xy-plane with center at the origin and principal axes in the direction of the x- and 
y-axes. In fact, since cos? t + sin? t = 1, we obtain from (3) 


x2 y? i 0 
t Zz 
a® ib? 
If b = a, then (3) represents a circle of radius a. B 
Nd 
(t= 5m) 
(t =n) y 1 
(t = 1) (t= 50) 
2 
\ ‘i z 
(t =0) Nie 
3 ae 
(t= 2) (= 50) =o) 
Fig. 201. Circle in Example 1 Fig. 202. Ellipse in Example 2 


Straight Line 


A straight line L through a point A with position vector a in the direction of a constant vector b (see Fig. 203) 
can be represented parametrically in the form 


(4) r(t)=a-+ tbh = [a + thy, do + the, ag + ths). 


SEC. 9.5 Curves. Arc Length. Curvature. Torsion 383 


If b is a unit vector, its components are the direction cosines of L. In this case, || measures the distance of the 
points of L from A. For instance, the straight line in the xy-plane through A: (3, 2) having slope | is (sketch it) 


rj =, 2, O)+dl, 1, O=Bte 2+4, OL B 


Fig. 203. Parametric representation of a straight line 


A plane curve is a curve that lies in a plane in space. A curve that is not plane is called 
a twisted curve. A standard example of a twisted curve is the following. 


EXAMPLE 4 _ Circular Helix 


The twisted curve C represented by the vector function 


(5) r(t) = [acost, asint, ct] = acosti+ asintj + ctk (c # 0) 


is called a circular helix. It lies on the cylinder xe + y? =a". If c > 0, the helix is shaped like a right-handed 
screw (Fig. 204). If c < 0, it looks like a left-handed screw (Fig. 205). If c = 0, then (5) is a circle. | 


Fig. 204. Right-handed circular helix Fig. 205. Left-handed circular helix 


A simple curve is a curve without multiple points, that is, without points at which the 
curve intersects or touches itself. Circle and helix are simple curves. Figure 206 shows 
curves that are not simple. An example is [sin 2, cost, 0]. Can you sketch it? 

An arc of a curve is the portion between any two points of the curve. For simplicity, 
we say “curve” for curves as well as for arcs. 


Pe Da 


Fig. 206. Curves with multiple points 
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Tangent to a Curve 


The next idea is the approximation of a curve by straight lines, leading to tangents and 
to a definition of length. Tangents are straight lines touching a curve. The tangent to a 
simple curve C at a point P of C is the limiting position of a straight line L through P 
and a point Q of C as Q approaches P along C. See Fig. 207. 

Let us formalize this concept. If C is given by r(t), and P and Q correspond to t and 
t + At, then a vector in the direction of L is 


1 
6 qlt@ + Ad - rl. 


In the limit this vector becomes the derivative 


' peal 
(7) r= ee bs + At) — rr], 


provided r(f) is differentiable, as we shall assume from now on. If r'(t) # 0, we call r’(A) 
a tangent vector of C at P because it has the direction of the tangent. The corresponding 
unit vector is the unit tangent vector (see Fig. 207) 


ae: 
[eas 
‘| 


(8) u 


ae 


Note that both r’ and u point in the direction of increasing t. Hence their sense depends 
on the orientation of C. It is reversed if we reverse the orientation. 
It is now easy to see that the tangent to C at P is given by 


(9) q(w) =r + wr’ (Fig. 208). 


This is the sum of the position vector r of P and a multiple of the tangent vector r’ of C 
at P. Both vectors depend on P. The variable w is the parameter in (9). 


Cc 


Fig. 207. Tangent to a curve Fig. 208. Formula (9) for the tangent to a curve 


Tangent to an Ellipse 
Find the tangent to the ellipse 4x? + y? = 1 at P: (V2, 1/V2). 


Solution. Equation (3) with semi-axes a = 2 and b = | gives r(f) = [2 cost, sin f]. The derivative is 
r'(t) = [—2 sin t, cos f]. Now P corresponds to t = 7/4 because 


r(7/4) = [2 cos (77/4), sin (77/4)] = [V2, 1/V2]. 
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Hence r’ (7/4) = [-v2, 1/V2]. From (9) we thus get the answer 
qw) =[V2, 1/V2] + wI-V2, 1/V2] =[V2d —- w), G/Vv2)0 + w)). 


To check the result, sketch or graph the ellipse and the tangent. B 


Length of a Curve 


We are now ready to define the length / of a curve. / will be the limit of the lengths of 
broken lines of n chords (see Fig. 209, where n = 5) with larger and larger n. For this, 
let r(t),a St Sb, represent C. For each n = 1, 2,---, we subdivide (“partition”) the 
interval a St Sb by points 


to(= a), t1,°°*,tn—-1, tn(= b), where to <ty< + < thy. 


This gives a broken line of chords with endpoints r(to),---, r(t,). We do this arbitrarily 
but so that the greatest |Atm| = |tm— tm—1l approaches 0 as n—>~. The lengths 
11, l2,--: of these chords can be obtained from the Pythagorean theorem. If r(f) has a 
continuous derivative r’(f), it can be shown that the sequence /, /o,--- has a limit, which 
is independent of the particular choice of the representation of C and of the choice of 
subdivisions. This limit is given by the integral 


b 
(10) I= | Vri or’ dt G = a) 
" dt 
l is called the length of C, and C is called rectifiable. Formula (10) is made plausible 
in calculus for plane curves and is proved for curves in space in [GenRef8] listed in App. 1. 
The actual evaluation of the integral (10) will, in general, be difficult. However, some 
simple cases are given in the problem set. 


Arc Length s of a Curve 


The length (10) of a curve C is a constant, a positive number. But if we replace the fixed 
b in (10) with a variable f, the integral becomes a function of ¢, denoted by s(t) and called 
the arc length function or simply the are length of C. Thus 


it 
(11) s(t) = | Vr' er’ dt (v = ) 


Here the variable of integration is denoted by 7 because f is now used in the upper limit. 

Geometrically, s (to) with some to > a is the length of the arc of C between the points 
with parametric values a and tg. The choice of a (the point s = 0) is arbitrary; changing 
a means changing s by a constant. 


ONL 


Fig. 209. Length of a curve 
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Linear Element ds. If we differentiate (11) and square, we have 


ds\ de d a AON LEY 
Ky ria _ yr vin _ (| & y f < 
er (*) “Fao (<) +(2) (2). 


It is customary to write 


(13*) dr = (dx, dy, dz] = dxi + dyj + dzk 
and 
(13) ds” = dre dr = dx? + dy” + dz’. 


ds is called the linear element of C. 


Arc Length as Parameter. The use of s in (1) instead of an arbitrary ¢ simplifies various 
formulas. For the unit tangent vector (8) we simply obtain 


(14) u(s) = r'(s). 


Indeed, |r’(s)| = (ds/ds) = 1 in (12) shows that r’(s) is a unit vector. Even greater 
simplifications due to the use of s will occur in curvature and torsion (below). 


Circular Helix. Circle. Arc Length as Parameter 


The helix r(f) = [acost,asin¢,ct] in (5) has the derivative r’(t) = [—asint,acost,c]. Hence 
rer =a? + Cc, a constant, which we denote by K. Hence the integrand in (11) is constant, equal to K, 
and the integral is s = Kt. Thus t = s/K, so that a representation of the helix with the arc length s as 
parameter is 


s Ss Ss: 368 

(15) rs) r( ) acos—, asin—, . K= Var + c?. 
K K K OK 

A circle is obtained if we set c = 0. Then K = a, t = s/a, and a representation with arc length s as parameter is 


s Ss es 
r*(s) =r acos—, asin—|]. iia 
a a a 


Curves in Mechanics. Velocity. Acceleration 


Curves play a basic role in mechanics, where they may serve as paths of moving bodies. 
Then such a curve C should be represented by a parametric representation r(t) with time 
t as parameter. The tangent vector (7) of C is then called the velocity vector v because, 
being tangent, it points in the instantaneous direction of motion and its length gives the 
speed lv] = |r'|] = Vr’ er’ = ds/dt, see (12). The second derivative of r(f) is called 
the acceleration vector and is denoted by a. Its length |a| is called the acceleration of 
the motion. Thus 


(16) v(t) = r' (0), a(t) = v(t) =r" (t). 
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EXAMPLE 7 


Tangential and Normal Acceleration. Whereas the velocity vector is always tangent 
to the path of motion, the acceleration vector will generally have another direction. We 
can split the acceleration vector into two directional components, that is, 


(17) a = atan + Anorm: 


where the tangential acceleration vector a,j, is tangent to the path (or, sometimes, 0) 
and the normal acceleration vector aj oym is normal (perpendicular) to the path (or, 
sometimes, 0). 

Expressions for the vectors in (17) are obtained from (16) by the chain rule. We first have 


dr _ dr ds 
dt ds dt 


v(t) = = u(s) “ 


where u(s) is the unit tangent vector (14). Another differentiation gives 


2 
d d d du(d. d” 
i) at 7 - {(a(9 #) . du 2 Ws) 


Since the tangent vector u(s) has constant length (length one), its derivative du/ds is 
perpendicular to u(s), from the result in Example 4 in Sec. 9.4. Hence the first term on 
the right of (18) is the normal acceleration vector, and the second term on the right is the 
tangential acceleration vector, so that (18) is of the form (17). 

Now the length |ajan| is the absolute value of the projection of a in the direction of v, 
given by (11) in Sec. 9.2 with b = vy; that is, atan| = lae v|/lvl. Hence atan is this 
expression times the unit vector (1/ lv|)v in the direction of v, that is, 


aev 


(18*) atan = yey’: Also, Anorm = a — Atan- 


We now turn to two examples that are relevant to applications in space travel. 
They deal with the centripetal and centrifugal accelerations, as well as the Coriolis 
acceleration. 


Centripetal Acceleration. Centrifugal Force 


The vector function 
r(t) = [Rceos ot, Resin wt] = Rcoswti+ Rsinotj (Fig. 210) 


(with fixed i and j) represents a circle C of radius R with center at the origin of the xy-plane and describes the 
motion of a small body B counterclockwise around the circle. Differentiation gives the velocity vector 


v=r' =[-Rosinwt, Rw cos wt] = —Rw sin wt i + Rw cos wt j (Fig. 210) 
v is tangent to C. Its magnitude, the speed, is 
lv] = [nr] = Vr' er’ = Ro. 


Hence it is constant. The speed divided by the distance R from the center is called the angular speed. It equals 
w, so that it is constant, too. Differentiating the velocity vector, we obtain the acceleration vector 


(19) a=v'’ =[-Rwcosat, —Rw” sin wf] = —Rw* cos wt i — Rw” sin ot j. 
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Fig. 210. Centripetal acceleration a 


This shows that a = —w2r (Fig. 210), so that there is an acceleration toward the center, called the centripetal 
acceleration of the motion. It occurs because the velocity vector is changing direction at a constant rate. Its 
magnitude is constant, |a| = w*|r| = w?R. Multiplying a by the mass m of B, we get the centripetal force ma. 
The opposite vector —ma is called the centrifugal force. At each instant these two forces are in equilibrium. 
We see that in this motion the acceleration vector is normal (perpendicular) to C; hence there is no tangential 
acceleration. faa 


Superposition of Rotations. Coriolis Acceleration 


A projectile is moving with constant speed along a meridian of the rotating earth in Fig. 211. Find its acceleration. 


Fig. 211. Example 8. Superposition of two rotations 


Solution. Let x, y, z be a fixed Cartesian coordinate system in space, with unit vectors i, j, k in the directions 
of the axes. Let the Earth, together with a unit vector b, be rotating about the z-axis with angular speed w > 0 
(see Example 7). Since b is rotating together with the Earth, it is of the form 


b(t) = cos wti + sin at j. 


Let the projectile be moving on the meridian whose plane is spanned by b and k (Fig. 211) with constant angular 
speed w > 0. Then its position vector in terms of b and k is 


r(t) = Rcos yt b(t) + Rsin ytk (R = Radius of the Earth). 
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We have finished setting up the model. Next, we apply vector calculus to obtain the desired acceleration of the 
projectile. Our result will be unexpected—and highly relevant for air and space travel. The first and second 
derivatives of b with respect to ft are 


b'(t) = —osinwti + wcos wtj 
(20) 
b” (0) = —w* cos wt i — w sin otj = —w"b(t). 


The first and second derivatives of r(¢) with respect to f are 


v=r'(t) =Rcos ytb’ — yRsin yt b + yRcos ytk 
(21) a=v' =Rcosytb” — 2yRsin ytb’ — y*Rcos ytb — y7R sin yt k 


= Rcos ytb" — 2yR sin ytb’ — yr. 


By analogy with Example 7 and because of b” = —w"b in (20) we conclude that the first term in a (involving w 
in b” !) is the centripetal acceleration due to the rotation of the Earth. Similarly, the third term in the last line (involving 
y!) is the centripetal acceleration due to the motion of the projectile on the meridian M of the rotating Earth. 

The second, unexpected term —2yR sin ytb’ in a is called the Coriolis acceleration® (Fig. 211) and is 
due to the interaction of the two rotations. On the Northern Hemisphere, sin yt > 0 (for t > 0; also y > 0 
by assumption), so that ago, has the direction of —b’, that is, opposite to the rotation of the Earth. |agoy| 
is maximum at the North Pole and zero at the equator. The projectile B of mass mg experiences a force 
—Mg cor Opposite to Mg Aeoy, Which tends to let B deviate from M to the right (and in the Southern 
Hemisphere, where sin yt < 0, to the left). This deviation has been observed for missiles, rockets, shells, 
and atmospheric airflow. isi] 


Curvature and Torsion. Optional 


This last topic of Sec. 9.5 is optional but completes our discussion of curves relevant to 
vector calculus. 

The curvature k(s) of a curve C: r(s) (s the arc length) at a point P of C measures the 
rate of change lu’ (s)| of the unit tangent vector u(s) at P. Hence k(s) measures the deviation 
of C at P from a straight line (its tangent at P). Since u(s) = r’ (s), the definition is 


(22) «(s) = |u’(s)| = [r”(s)| (' = d/ds). 


The torsion 7(s) of C at P measures the rate of change of the osculating plane O of 
curve C at point P. Note that this plane is spanned by u and u’ and shown in Fig. 212. 
Hence 7(s) measures the deviation of C at P from a plane (from O at P). Now the rate 
of change is also measured by the derivative b’ of a normal vector b at O. By the definition 
of vector product, a unit normal vector of Oisb = u X (1/«)u’ =u X p.Herep = (1/«)u’ 
is called the unit principal normal vector and b is called the unit binormal vector of C 
at P. The vectors are labeled in Fig. 212. Here we must assume that k # 0; hence k > 0. 
The absolute value of the torsion is now defined by 


(23%) [7(s)| = |b’ (s)I. 


Whereas xk(s) is nonnegative, it is practical to give the torsion a sign, motivated by 
“right-handed” and “left-handed” (see Figs. 204 and 205). This needs a little further 
calculation. Since b is a unit vector, it has constant length. Hence b’ is perpendicular 


3GUSTAVE GASPARD CORIOLIS (1792-1843), French engineer who did research in mechanics. 
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Binormal 


b Normal plane 


Rectifying plane ay 


NG 
oe 
| Osculating plane 


Fig. 212. Trihedron. Unit vectors u, p, b and planes 


to b (see Example 4 in Sec. 9.4). Now b’ is also perpendicular to u because, by the 
definition of vector product, we have b* u = 0,b*u’ = 0. This implies 


(beu)’ =0; that is, b’eut+beu =b’eut+0=0. 


Hence if b’ # 0 at P, it must have the direction of p or —p, so that it must be of the form 
b’ = —7p. Taking the dot product of this by p and using p * p = 1 gives 


(23) T(s) = —p(s) * b'(s). 


The minus sign is chosen to make the torsion of a right-handed helix positive and that of 
a left-handed helix negative (Figs. 204 and 205). The orthonormal vector triple u, p, b is 
called the trihedron of C. Figure 212 also shows the names of the three straight lines in 
the directions of u, p, b, which are the intersections of the osculating plane, the normal 
plane, and the rectifying plane. 


PROBLEM SET 975 


. [4 cos ¢, 4 sin ¢, 32] 
. [cosh ¢, sinh t, 2] 


about the z-axis and the plane z = y. 
17. Circle 5x +y=12=y. 


1-10; PARAMETRIC REPRESENTATIONS FIND A PARAMETRIC REPRESENTATION 
What curves are represented by the following? 11. Circle in the plane z = 1 with center (3, 2) and passing 
Sketch them. through the origin. 

1. [3 + 2 cos ¢, 2 sin t, 0] 12. Circle in the yz-plane with center (4, 0) and passing 

2. [a+tb+ 3t,c— 5f] through (0, 3). Sketch it. 

3. [0, ¢, 1] 13. Straight line through (2, 1, 3) in the direction of i + 2j. 

4. [-2,2 + 5cost, -1+S5sint¢] 14. Straight line through (1, 1, 1) and (4, 0, 2). Sketch it. 

5. [2 + 4cost,1 + sin t, 0] 15. Straight line y = 4x — 1, z = 5x. 

6. [a + 3 cos Tt, b — 2 sin Trt, 0] 16. The intersection of the circular cylinder of radius 1 

7 

8 

9 


. [cos ft, sin 2r, 0] 
. [f, 2, 1/7] 


— 
—] 


18. Helix x? + y? = 25, z = 2 arctan (y/x). 
19. Hyperbola 4x? — 3y” =4,z=-2. 
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20. Intersection of 2x — y + 3z = 2andx + 2y —z=3. 
21. Orientation. Explain why setting t = —f* reverses 
the orientation of [a cos t, a sin t, O]. 
22. CAS PROJECT. Curves. Graph the following more 
complicated curves: 
(a) r(t)=[2cost+ cos 2r,2 sin t — sin 2t] (Steiner’s 
hypocycloid). 
(b) r(t) = [cost + k cos 2t, sin t — k sin 2t] withk = 
10,9, 15, 0.95 1. 
(c) r(t) = [cos t, 
(d) r(t) = [cost¢, sin kt]. For what k’s will it be closed? 
(e) r(t) = [Rsin wt + wRt, Roos wt + R] (cycloid). 
23. CAS PROJECT. Famous Curves in Polar Form. 
Use your CAS to graph the following curves* given in 
polar form p = p(6), po = x74 cre tan 6 = y/x, and 
investigate their form depending on parameters a and b. 


sin 51] (a Lissajous curve). 


p=a0_ Spiral of Archimedes 


p= ae’ Logarithmic spiral 
2a sin? @ Cadaofped 
= —_ ssoid of Diocles 
p eee issoi Oc 
p= 2 he Conchoid of Nicomedes 
cos 0 


p=a/@ Hyperbolic spiral 


3a sin 20 . 
p= a ee Folium of Descartes 
cos” 0 + sin” 0 
sin 30 : ‘ , 
p = 2a Maclaurin’s trisectrix 


sin 20 


p =2acos6 +b Pascal’s snail 


24-28; TANGENT 


Given a curve C:r(f), find a tangent vector r’(f), a unit 
tangent vector u' (1), and the tangent of C at P. Sketch curve 
and tangent. 


24. r(t) = [t, 922, 1], P: (2,2, 1) 

25. r(t) = [10 cost, 1,10 sin t], P: (6, 1, 8) 
26. r(t) = [cos tf, sin t, 9f],  P: (1, 0, 1877) 
27. r(t) = [t,1/t, 0], P: (2,4, 0) 

28. ri) = [407,07], P21 2 


29-32 | LENGTH 


Find the length and sketch the curve. 
29. Catenary r(f) = [t, cosh ¢] from t = 0 tor = 1. 


30. Circular helix r(t) = [4 cos ¢, 4 sin ¢, 5t] from (4, 0, 0) 
to (4, 0, 1077). 
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31. Circle r(t) = [acos t, asin ft] from (a, 0) to (0, a). 
32. Hypocycloid r(t) = [a cos* t, a sin® f], total length. 


33. Plane curve. Show that Eq. (10) implies 
f= fy Vi+ yy’? dx for the length of a plane curve 


C:y = f(x), z = 0, anda = x = b. 


34. Polar coordinates p = Vx? + y, @ = arctan (y/x) 
give 


B 
t= | Vp? + p’ dd, 


Qa 


where p’ = dp/d0. Derive this. Use it to find the total 
length of the cardioid p = a(1 — cos @). Sketch this 
curve. Hint. Use (10) in App. 3.1. 


35-46 | CURVES IN MECHANICS 


Forces acting on moving objects (cars, airplanes, ships, etc.) 
require the engineer to know corresponding tangential and 
normal accelerations. In Probs. 35-38 find them, along 
with the velocity and speed. Sketch the path. 


35. Parabola r (1) = [t, f”, 0]. Find v and a. 

36. Straight line r(‘) = [8¢, 6¢, 0]. Find v and a. 

37. Cycloid r(t) = (Rsin owt + Rt)i + (Rcos wt + R)j. 
This is the path of a point on the rim of a wheel of 


radius R that rolls without slipping along the x-axis. 
Find v and a at the maximum y-values of the curve. 


38. Ellipse r = [cos ¢, 2 sin ¢, OJ. 


39-42 | THE USE OF A CAS may greatly facilitate the 
investigation of more complicated paths, as they occur in 


gear transmissions and other constructions. To grasp the 
idea, using a CAS, graph the path and find velocity, speed, 
and tangential and normal acceleration. 


39. r(t) = [cost + cos 2¢, sin t — sin 21] 
40. r(t) = [2 cost + cos 2t, 2 sint — sin 2¢] 
41. r(t) = [cos ¢, sin 2t, cos 2t] 

42. r(t) = [ct cos ft, ct sin t, ct] (c # 0) 


43. Sun and Earth. Find the acceleration of the Earth 
toward the sun from (19) and the fact that Earth 
revolves about the sun in a nearly circular orbit with 
an almost constant speed of 30 km/s. 


44, Earth and moon. Find the centripetal acceleration 
of the moon toward Earth, assuming that the orbit 
of the moon is a circle of radius 239,000 miles = 
3.85 + 10° m, and the time for one complete revolution 
is 27.3 days = 2.36 - 108s. 


4Named after ARCHIMEDES (c. 287-212 B.c.), DESCARTES (Sec. 9.1), DIOCLES (200 B.c.), 
MACLAURIN (Sec. 15.4), NICOMEDES (250? B.c.) ETIENNE PASCAL (1588-1651), father of BLAISE 


PASCAL (1623-1662). 
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45. 


46. 
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Satellite. Find the speed of an artificial Earth satellite 
traveling at an altitude of 80 miles above Earth’s 
surface, where g = 31 ft/sec”. (The radius of the Earth 
is 3960 miles.) 


Satellite. A satellite moves in a circular orbit 
450 miles above Earth’s surface and completes 
1 revolution in 100 min. Find the acceleration of gravity 
at the orbit from these data and from the radius of Earth 
(3960 miles). 


47-55 


CURVATURE AND TORSION 


47. 


48. 


49. 


Circle. Show that a circle of radius a has curvature 
1/a. 

Curvature. Using (22), show that if C is represented 
by r(¢) with arbitrary 7, then 


Vir’ : r’\(r” a r”) _ (r’ ‘ r’)? 


(22*) 
(r’ 7 r’)?/? 


K(t) = 


Plane curve. Using (22*), show that for a curve 


y =f), 


(22**) K(x) ug ( ie t! ) 
. K(x) = —————_ y =—, etc. }. 
a+ y’2)3/2 : dx 


9.6 Calculus Review: 


Functions of Several Variables. 


50. 


51. 


52. 


53. 


54. 


55. 


Torsion. Using b = u X p and (23), show that (when 
k > 0) 


(23**) r(s)=(u p p= xr” r”)/K 


Torsion. Show that if C is represented by r(f) with 
arbitrary parameter f, then, assuming k > 0 as before, 


(r’ r” r’”) 


(r’ 4 yr” _ r”) = (r’ - rr”)? 


(23*7*) T(t) = 


Helix. Show that the helix [acost, asint, ct] can 
be represented by [acos (s/K), asin(s/K), cs/K], 


where K = Va? +c” and s is the arc length. Show 
that it has constant curvature k = a/K? and torsion 
T = c/K*. 

Find the torsion of C: r(t) = [f, i”, tJ, which looks 
similar to the curve in Fig. 212. 


Frenet® formulas. Show that 
u = Kp, p’ =-—xu+ 7b, b’ = —Tp. 


Obtain « and 7 in Prob. 52 from (22*) and (23***) 
and the original representation in Prob. 54 with 
parameter f. 


Optional 


The parametric representations of curves C required vector functions that depended on a 
single variable x, s, or t. We now want to systematically cover vector functions of several 
variables. This optional section is inserted into the book for your convenience and to make 
the book reasonably self-contained. Go onto Sec. 9.7 and consult Sec. 9.6 only when 
needed. For partial derivatives, see App. A3.2. 


Chain Rules 


Figure 213 shows the notations in the following basic theorem. 


« eae elu, v), yu, | 


Fig. 213. 


K y 


Notations in Theorem 1 


®JEAN-FREDERIC FRENET (1816-1900), French mathematician. 


SEC. 9.6 Calculus Review: Functions of Several Variables. Optional 


THEOREM -1 
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Chain Rule 


Let w = f(x, y, z) be continuous and have continuous first partial derivatives in a 
domain D in xyz-space. Let x = x(u,V), y = yu, Vv), Z = Zu, Vv) be functions that 
are continuous and have first partial derivatives in a domain B in the uv-plane, 
where B is such that for every point (u, v) in B, the corresponding point [x(u, v), 


y(u, V), z(u, Vv) lies in D. See Fig. 213. Then the function 
w = f(x(u, Vv), y(u, Vv), z(u, V)) 


is defined in B, has first partial derivatives with respect to u and v in B, and 


(1) 


dw _ ow ox , ow dy , OW 0Z 
Ou ox du dy Ou 0z Ou 
Ow Ow Ox in Ow Oy . OW OZ 
dv dx dv. dy dv. dz dV 


In this theorem, a domain D is an open connected point set in xyz-space, where 
“connected” means that any two points of D can be joined by a broken line of finitely 
many linear segments all of whose points belong to D. “Open” means that every point P 
of D has a neighborhood (a little ball with center P) all of whose points belong to D. For 
example, the interior of a cube or of an ellipsoid (the solid without the boundary surface) 


is a domain. 


In calculus, x, y, z are often called the intermediate variables, in contrast with the 


independent variables u, v and the dependent variable w. 


Special Cases of Practical Interest 


If w = f(x, y) and x = x(u, v), y = y(u, v) as before, then (1) becomes 


(2) 


Ow _ OW Ox Ow Oy 
Ou Ox Ou oy Ou 
Ow  Owox . Ow doy 
dv. dx dv. dy dV" 


If w = f(x, y, z) and x = x(t), y = y(t), z = 2(0), then (1) gives 


dw dwdx | dowdy 


ow dz 


(3) dt ax dt dy dt 


dz dt 
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EXAMPLE 1 


EXAMPLE 2 
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If w = f(x, y) and x = x(t), y = y(d), then (3) reduces to 


dw dwdx | dw dy 
(4) = ad 5 
dt ox dt ody dt 


Finally, the simplest case w = f(x), x = x(t) gives 


dw _ dw dx 
(3) dt dx dt 


Chain Rule 


If w = x2 — y and we define polar coordinates r, 6 by x = rcos 0, y = rsin 6, then (2) gives 


ow F 2 2 

oe 2x cos 6 — 2y sin @ = 2rcos* 6 — 2r sin” 6 = 2rcos 20 

aw 

- = 2x(—r sin 0) — 2y(r cos 6) = —2r? cos @ sin 8 — 2r7 sin @ cos 6 = —2r? sin 26. iia 


Partial Derivatives on a Surface z = g(x, y) 


Let w = f(x, y, z) and let z = g(x, y) represent a surface S in space. Then on S the function 
becomes 


w(x, y) = fl, y, 8, y)). 


Hence, by (1), the partial derivatives are 


aw F) of a awd of a 
6) he Oe w_ of | of ag 
Ox Ox dz Ox dy dy dz dy 


[z = g(x, y)]. 


We shall need this formula in Sec. 10.9. 


Partial Derivatives on Surface 


Let w =f=x°? + y® + 23 and let z = g = x2 + y”. Then (6) gives 


3x? + 327 - Qn = 3x2 + 3(x? 4 y?)? + 2x, 


By? + 322+ 2y = 3y? + 3(a? + y?)? - 2y, 


We confirm this by substitution, using w(x, y) = xe + y? + (x2 + yy, that is, 


aw aw 
x = 3x7 + 3(x? + y?)? + 2x, = = 3y7 + 3(x? + y?)? + 2y, fal 
x y 
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Mean Value Theorems 


THEOREM 2 Mean Value Theorem 


Let f(x, y, z) be continuous and have continuous first partial derivatives in a 
domain D in xyz-space. Let Po: (Xo, yo, Zo) and P: (xo + h, yo + k, Zo + 1) be 
points in D such that the straight line segment PoP joining these points lies entirely 
in D. Then 


f 


af. af . a 
(7) flo + hyo +k z0 + 1) — fo, Yo, 20) = nn ees I a 


0z 


the partial derivatives being evaluated at a suitable point of that segment. 


Special Cases 


For a function f(x, y) of two variables (satisfying assumptions as in the theorem), formula 
(7) reduces to (Fig. 214) 


0 0 
(8) Flso + yo + 8) ~ flea. yo) = WS + RE, 


and, for a function f(x) of a single variable, (7) becomes 


of 
ar a) = = =, 
(9) TINO) CK eee 
where in (9), the domain D is a segment of the x-axis and the derivative is taken at a 
suitable point between xg and x9 + h. 


(xy +h, Wy +B) 


D 


Coie) 


Fig. 214. Mean value theorem for a function of two variables [Formula (8)] 


9.7 Gradient of a Scalar Field. 
Directional Derivative 


We shall see that some of the vector fields that occur in applications—not all of them!— 
can be obtained from scalar fields. Using scalar fields instead of vector fields is of a 
considerable advantage because scalar fields are easier to use than vector fields. It is the 
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DEFINITION 1 


DEFINITION 2 
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“gradient” that allows us to obtain vector fields from scalar fields, and thus the gradient 
is of great practical importance to the engineer. 


Gradient 


The setting is that we are given a scalar function f(x, y, z) that is defined and 
differentiable in a domain in 3-space with Cartesian coordinates x, y, z. We denote 
the gradient of that function by grad f or Vf (read nabla f). Then the qradient of 
F(x, y, Z) is defined as the vector function 


of of a) _ af . rf af oe 


(1) grad f = Vf = [= ay’ az 


Remarks. For a definition of the gradient in curvilinear coordinates, see App. 3.4. 

As a quick example, if f(x, y, z) = 2y? + 4xz + 3x, then grad f= [4z + 3, 6y”, 4x]. 

Furthermore, we will show later in this section that (1) actually does define a vector. 
The notation Vf is suggested by the differential operator V (read nabla) defined by 


(1*) Ve ae 


Gradients are useful in several ways, notably in giving the rate of change of f(x, y, z) 
in any direction in space, in obtaining surface normal vectors, and in deriving vector fields 
from scalar fields, as we are going to show in this section. 


Directional Derivative 


From calculus we know that the partial derivatives in (1) give the rates of change of 
F(x, y, Z) in the directions of the three coordinate axes. It seems natural to extend this and 
ask for the rate of change of fin an arbitrary direction in space. This leads to the following 
concept. 


Directional Derivative 


The directional derivative Dy f or df/ds of a function f(x, y, z) at a point P in the 
direction of a vector b is defined by (see Fig. 215) 


dj (Q) — f(P) 
(2) Dpf = . = tim” S 


s>0 Ss 


Here Q is a variable point on the straight line L in the direction of b, and |s| is the 
distance between P and Q. Also, s > 0 if Q lies in the direction of b (as in Fig. 215), 
s < Oif Q lies in the direction of —b, and s = 0 if Q = P. 
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EXAMPLE 1 


Fig. 215. Directional derivative 


The next idea is to use Cartesian xyz-coordinates and for b a unit vector. Then the line L 
is given by 


(3) r(s) = x(s)i + y(s)j + z(s)k = po + sb (\b| = 1) 


where po the position vector of P. Equation (2) now shows that Dy f = df/ds is the 
derivative of the function f(x(s), y(s), z(s)) with respect to the arc length s of L. Hence, 
assuming that f has continuous partial derivatives and applying the chain rule [formula 
(3) in the previous section], we obtain 


df of of of 
4 Def =— = —x' +—y' +2’ 
4) of ds ax ay> az” 


where primes denote derivatives with respect to s (which are taken at s = 0). But here, 
differentiating (3) gives r’ = x'i + y'j + z'k = b. Hence (4) is simply the inner product 
of grad f and b [see (2), Sec. 9.2]; that is, 


d 
(5) Dpf = < = be grad f ([b] = 1). 


ATTENTION! | If the direction is given by a vector a of any length (# 0), then 


d 
(5*) Daf == (uae aad 


Gradient. Directional Derivative 
Find the directional derivative of f(x, y, z) = 2x? + 3y? + 2? at P: (2, 1, 3) in the direction of a = [1, 0, —2]. 


Solution. grad f = [4x, 6y, 2z] gives at P the vector grad f(P) = [8, 6, 6]. From this and (5*) we obtain, 
since |a| = V5, 


1 1 
Daf(P) = —=[1, 0, —2] « [8, 6, 6] (8 + 0 — 12) 1.789. 
v5 v5 v5 


The minus sign indicates that at P the function fis decreasing in the direction of a. si 
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Gradient Is a Vector. Maximum Increase 


Here is a finer point of mathematics that concerns the consistency of our theory: grad f 
in (1) looks like a vector—after all, it has three components! But to prove that it actually 
is a vector, since it is defined in terms of components depending on the Cartesian 
coordinates, we must show that grad f has a length and direction independent of the choice 
of those coordinates. See proof of Theorem 1. In contrast, [df/dx, 20f/dy, df/dz] also looks 
like a vector but does not have a length and direction independent of the choice of Cartesian 
coordinates. 

Incidentally, the direction makes the gradient eminently useful: grad f points in the 
direction of maximum increase of f. 


Use of Gradient: Direction of Maximum Increase 


Let f(P) = f(x, y, z) be a scalar function having continuous first partial derivatives 
in some domain B in space. Then grad f exists in B and is a vector, that is, its length 
and direction are independent of the particular choice of Cartesian coordinates. If 
grad f(P) # 0 at some point P, it has the direction of maximum increase of f at P. 


From (5) and the definition of inner product [(1) in Sec. 9.2] we have 
(6) Dp f = |b||grad f| cos y = |grad f| cos y 


where y is the angle between b and grad f: Now fis a scalar function. Hence its value at 
a point P depends on P but not on the particular choice of coordinates. The same holds 
for the arc length s of the line L in Fig. 215, hence also for Dy f; Now (6) shows that Dyf 
is maximum when cos y = 1, y = 0, and then Dy, f = |grad f]. It follows that the length 
and direction of grad f are independent of the choice of coordinates. Since y = 0 if and 
only if b has the direction of grad f, the latter is the direction of maximum increase of f 
at P, provided grad f # 0 at P. Make sure that you understood the proof to get a good 
feel for mathematics. 


Gradient as Surface Normal Vector 


Gradients have an important application in connection with surfaces, namely, as surface 
normal vectors, as follows. Let S be a surface represented by f(x, y, z) = c = const, where 
fis differentiable. Such a surface is called a level surface of f, and for different c we get 
different level surfaces. Now let C be a curve on S through a point P of S. As a curve in 
space, C has a representation r(t) = [x(0), y(4), z()]. For C to lie on the surface S, the 
components of r(f) must satisfy f(x, y, z) = c, that is, 


(7) fa, yO, 2) = ¢. 


Now a tangent vector of C is r’() = ke’, y'(t), z'(t)]. And the tangent vectors of all 
curves on S passing through P will generally form a plane, called the tangent plane of S 
at P. (Exceptions occur at edges or cusps of S, for instance, at the apex of the cone in 
Fig. 217.) The normal of this plane (the straight line through P perpendicular to the tangent 
plane) is called the surface normal to S at P. A vector in the direction of the surface 
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normal is called a surface normal vector of S at P. We can obtain such a vector quite 
simply by differentiating (7) with respect to t. By the chain rule, 


of 
+—y'+—z’ =(gradf)er’ =0. 
ay ay> az (grad f) er 


Hence grad f is orthogonal to all the vectors r’ in the tangent plane, so that it is a normal 
vector of S at P. Our result is as follows (see Fig. 216). 


Tangent plane f= *) 


Fig. 216. Gradient as surface normal vector 


THEOREM 2 Gradient as Surface Normal Vector 


Let f be a differentiable scalar function in space. Let f(x, y, Zz) = c = const represent 
a surface S. Then if the gradient of f at a point P of S is not the zero vector, it is a 
normal vector of S at P. 


EXAMPLE 2. Gradient as Surface Normal Vector. Cone 
Find a unit normal vector n of the cone of revolution z? = A(x? + y?) at the point P: (1, 0, 2). 
Solution. The cone is the level surface f = 0 of f(x, y, 2) = Ax? + y?) — 2”. Thus (Fig. 217) 
grad f = [8x, 8y, —2z], grad f(P) = [8, 0, —4] 


1 1 
” Tarad FOTO = [v5 v5 | 


n points downward since it has a negative z-component. The other unit normal vector of the cone at Pis—n. Mf 


a a 


Fig. 217. Cone and unit normal vector n 
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Vector Fields That Are Gradients 
of Scalar Fields (“Potentials”) 


At the beginning of this section we mentioned that some vector fields have the advantage 
that they can be obtained from scalar fields, which can be worked with more easily. Such 
a vector field is given by a vector function v(P), which is obtained as the gradient of a 
scalar function, say, v(P) = grad f(P). The function f(P) is called a potential function or 
a potential of v(P). Such a v(P) and the corresponding vector field are called conservative 
because in such a vector field, energy is conserved; that is, no energy is lost (or gained) 
in displacing a body (or a charge in the case of an electrical field) from a point P to another 
point in the field and back to P. We show this in Sec. 10.2. 

Conservative fields play a central role in physics and engineering. A basic application 
concerns the gravitational force (see Example 3 in Sec. 9.4) and we show that it has a 
potential which satisfies Laplace’s equation, the most important partial differential 
equation in physics and its applications. 


Gravitational Field. Laplace’s Equation 


The force of attraction 


(8) p=-—r=-e 

r r r r 

between two particles at points Po: (x0, yo, Zo) and P: (x, y, z) (as given by Newton’s 

law of gravitation) has the potential f(x, y, z) = c/r, where r (> 0) is the distance 
between Po and P. 

Thus p = grad f = grad (c/r). This potential f is a solution of Laplace’s equation 


af af of 


(9) VF 
ax? ay” az” 


[Vf (read nabla squared f) is called the Laplacian of f.] 


That distance is r = ((x — xy (y= yo)” (z= gov 2" The key observation now 
is that for the components of p = [p1, Pa, p3] we obtain by partial differentiation 


(10a) 2 (4) —2(x — xo) _xX— Xo 
‘ ax\r) Wae—-xr+o—-wretk-wr2 OP 


and similarly 


(10b) 
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From this we see that, indeed, p is the gradient of the scalar function f = c/r. The second 
statement of the theorem follows by partially differentiating (10), that is, 


a Al 1  3(x— x0)" 
2(+] 3 t 5 ? 


Ox r r r 

ofl). t , 3690)" 
2 3+ 5 > 

oy \r r r 

a? (:) 1, 3@ = 20)" 

az r id r? , 


and then adding these three expressions. Their common denominator is r°. Hence the 
three terms —1/ r? contribute —3r? to the numerator, and the three other terms give 
the sum 


3(x — xo)” + 3(y — yo)” + 3(¢ — 20)” = 37”, 


so that the numerator is 0, and we obtain (9). A 


V7f is also denoted by Af, The differential operator 


(11) TN ee 


(read “nabla squared” or “‘delta’”’) is called the Laplace operator. It can be shown that the 
field of force produced by any distribution of masses is given by a vector function that is 
the gradient of a scalar function f, and f satisfies (9) in any region that is free of matter. 

The great importance of the Laplace equation also results from the fact that there are 
other laws in physics that are of the same form as Newton’s law of gravitation. For instance, 
in electrostatics the force of attraction (or repulsion) between two particles of opposite (or 
like) charge Q, and Qzg is 


(12) is es (Coulomb’s law’). 


Laplace’s equation will be discussed in detail in Chaps. 12 and 18. 
A method for finding out whether a given vector field has a potential will be explained 
in Sec. 9.9. 


®CHARLES AUGUSTIN DE COULOMB (1736-1806), French physicist and engineer. Coulomb’s law was 
derived by him from his own very precise measurements. 
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CALCULATION OF GRADIENTS 


Find grad f. Graph some level curves f = const. Indicate 
Vf by arrows at some points of these curves. 


lL f=@+ DQy-b 
2. f = 9x7 + 4y? 


3. f= y/x 
4. (y + 67 + @ - 47 
5. f=x*+y* 


6. f = (x2 — y?)/(x? + y?) 
USEFUL FORMULAS FOR GRADIENT 
AND LAPLACIAN 
Prove and illustrate by an example. 
7. Vf" = nf” VF 
8. Vi fg) = fVs + sVf 
9. V(f/g) = (1/s°(gVf — f V8) 
10. V?( fe) = gV?f + 2Vf- Vg + fV7e 
USE OF GRADIENTS. ELECTRIC FORCE 


The force in an electrostatic field given by f(x, y, z) has the 

direction of the gradient. Find Vf and its value at P. 

ll. f= xy, P:(-4,5) 

12. f=x/? + y), P21) 

13. f = In(x? + y), P: (8, 6) 

14. f= (2 + y? + 27? P: (12, 0, 16) 

15. f = 4x2 + 9y2 + 27, P:(5,-1,-11) 

16. For what points P:(x,y,z) does Vf with 
f= 25x72 + 9y? + 162” have the direction from P to 
the origin? 

17. Same question as in Prob. 16 when f = 25x2 + 4y?, 


PROBLEM SET 9-7 


18-23| VELOCITY FIELDS 


Given the velocity potential f of a flow, find the velocity 
v = Vf of the field and its value v(P) at P. Sketch v(P) 
and the curve f = const passing through P. 


18. f = x? — 6x — y*, P:(-1,5) 

19. f= cosxcoshy, P: 6 q, In 2) 

20. f=xl+@?% ty), PO) 

21. f= e* cosy, P: (1,47) 

22. At what points is the flow in Prob. 21 directed vertically 
upward? 

23. At what points is the flow in Prob. 21 horizontal? 


24-27; HEAT FLOW 

Experiments show that in a temperature field, heat flows in 

the direction of maximum decrease of temperature 7. Find 

this direction in general and at the given point P. Sketch 
that direction at P as an arrow. 

24. T = 3x? — 2y", P: (2.5, 1.8) 

25. T = 2/(x? + y?), P: (0, 1,2) 

26. T= x2 + y2 +427, P: (2, -1,2) 

27. CAS PROJECT. Isotherms. Graph some curves of 
constant temperature (“isotherms”) and _ indicate 
directions of heat flow by arrows when the temperature 
equals (a) x3 — 3xy, (b) sin x sinh y, and (ce) e”cos y. 

28. Steepest ascent. If z(x,y) = 3000 — x2 — Oy? 
[meters] gives the elevation of a mountain at sea level, 
what is the direction of steepest ascent at P: (4, 1)? 

29. Gradient. What does it mean if |Vf(P)| > |V(Q)| at 

two points P and Q in a scalar field? 


9.8 Divergence of a Vector Field 


Vector calculus owes much of its importance in engineering and physics to the gradient, 
divergence, and curl. From a scalar field we can obtain a vector field by the gradient 
(Sec. 9.7). Conversely, from a vector field we can obtain a scalar field by the divergence 
or another vector field by the curl (to be discussed in Sec. 9.9). These concepts were 
suggested by basic physical applications. This will be evident from our examples. 

To begin, let v(x, y, z) be a differentiable vector function, where x, y, z are Cartesian 
coordinates, and let V1, Vo, U3 be the components of v. Then the function 


Ovy V2 0v3 


(1) div Vv = aF + 


Ox oy 0z 
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THEOREM-—1 


is called the divergence of v or the divergence of the vector field defined by v. For 
example, if 


v = [3xz, 2xy, —yz”] = 3xci + 2xyj — yc?k, then div v = 3z + 2x — 2yz. 
Another common notation for the divergence is 


a0 4 
ax’ dy’ dz 


divv=Vev=| | + Posv2. oI 


0 0 0 
i+ j + —k }° (Yyi + Voj + Usk 
(2 ay! 24) ee, 2J 3k) 


with the understanding that the “product” (0/dx)v, in the dot product means the partial 
derivative 0v,/dx, etc. This is a convenient notation, but nothing more. Note that V* v 
means the scalar div v, whereas Vf means the vector grad f defined in Sec. 9.7. 

In Example 2 we shall see that the divergence has an important physical meaning. 
Clearly, the values of a function that characterizes a physical or geometric property must 
be independent of the particular choice of coordinates. In other words, these values must 
be invariant with respect to coordinate transformations. Accordingly, the following 
theorem should hold. 


Invariance of the Divergence 

The divergence div v is a scalar function, that is, its values depend only on the 
points in space (and, of course, on Vv) but not on the choice of the coordinates in 
(1), so that with respect to other Cartesian coordinates x*, y*, z* and corresponding 
components U4*, Ug*, U3* of V, 


dv, dUS, dU3 
+ + 


2 div v = . 
@) “7 Ox* oy* 0z* 


We shall prove this theorem in Sec. 10.7, using integrals. 

Presently, let us turn to the more immediate practical task of gaining a feel for the 
significance of the divergence. Let f(x, y, z) be a twice differentiable scalar function. Then 
its gradient exists, 


of of of of of, . 
v= and/= | et = i+ —j k 
Ox dy Oz Ox oy 0z 
and we can differentiate once more, the first component with respect to x, the second with 
respect to y, the third with respect to z, and then form the divergence, 
dy v= di Gedy) = —4 Fe a! 
Ox oy 0z 
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Hence we have the basic result that the divergence of the gradient is the Laplacian 
(Sec. 9.7), 


(3) div (grad f) = V°f. 


Gravitational Force. Laplace’s Equation 


The gravitational force p in Theorem 3 of the last section is the gradient of the scalar function f(x, y, z) = c/r, 
which satisfies Laplaces equation Wf = 0. According to (3) this implies that div p = 0 (r > 0). 


The following example from hydrodynamics shows the physical significance of the 
divergence of a vector field. We shall get back to this topic in Sec. 10.8 and add further 
physical details. 


Flow of a Compressible Fluid. Physical Meaning of the Divergence 


We consider the motion of a fluid in a region R having no sources or sinks in R, that is, no points at which 
fluid is produced or disappears. The concept of fluid state is meant to cover also gases and vapors. Fluids in 
the restricted sense, or liquids, such as water or oil, have very small compressibility, which can be neglected in 
many problems. In contrast, gases and vapors have high compressibility. Their density p (= mass per unit volume) 
depends on the coordinates x, y, z in space and may also depend on time ¢. We assume that our fluid is 
compressible. We consider the flow through a rectangular box B of small edges Ax, Ay, Az parallel to the 
coordinate axes as shown in Fig. 218. (Here A is a standard notation for small quantities and, of course, has 
nothing to do with the notation for the Laplacian in (11) of Sec. 9.7.) The box B has the volume AV = Ax Ay Az. 
Let v = [UV1, Vg, V3] = vy yi + Vej + Usk be the velocity vector of the motion. We set 


(4) u = pv = [Wy, Ug, Ug] = uyi + ugj + ugk 


and assume that u and v are continuously differentiable vector functions of x, y, z, and f, that is, they have first 
partial derivatives which are continuous. Let us calculate the change in the mass included in B by considering 
the flux across the boundary, that is, the total loss of mass leaving B per unit time. Consider the flow through 
the left of the three faces of B that are visible in Fig. 218, whose area is Ax Az. Since the vectors vyi and v3 k 
are parallel to that face, the components v1 and v3 of v contribute nothing to this flow. Hence the mass of fluid 
entering through that face during a short time interval Ar is given approximately by 


(pv2)y Ax Az At = (ug)y Ax Az At, 


where the subscript y indicates that this expression refers to the left face. The mass of fluid leaving the box B 
through the opposite face during the same time interval is approximately (12), 44, Ax Az At, where the subscript 
y + Ay indicates that this expression refers to the right face (which is not visible in Fig. 218). The difference 


Au 
Aup Ax Az At = oa AVAL — [Aug = (a)y+ay — Cady 
iy 


is the approximate loss of mass. Two similar expressions are obtained by considering the other two pairs of 
parallel faces of B. If we add these three expressions, we find that the total loss of mass in B during the time 
interval At is approximately 


( Au, Aug Au3 


+ Jay At, 
Ax Ay Az 


where 
Auy = Wy)r+ax — Ue and — Aug = (u3)z4a2 — (U3)z- 
This loss of mass in B is caused by the time rate of change of the density and is thus equal to 


dp 
—AVAt. 
at 
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Box B 


Az 


Zz (x, y, 2) 


Ax 


Fig. 218. Physical interpretation of the divergence 


If we equate both expressions, divide the resulting equation by AV At, and let Ax, Ay, Az, and At approach 
zero, then we obtain 


op 

divu = di == 2. 

ivu iv (pv) 7 
or 


op 
(5) aa + div (pv) = 0. 


This important relation is called the condition for the conservation of mass or the continuity equation of a 
compressible fluid flow. 
If the flow is steady, that is, independent of time, then dp/dt = 0 and the continuity equation is 


(6) div (py) = 0. 


If the density p is constant, so that the fluid is incompressible, then equation (6) becomes 
(7) div v = 0. 


This relation is known as the condition of incompressibility. It expresses the fact that the balance of outflow 
and inflow for a given volume element is zero at any time. Clearly, the assumption that the flow has no sources 
or sinks in R is essential to our argument. v is also referred to as solenoidal. 

From this discussion you should conclude and remember that, roughly speaking, the divergence measures 
outflow minus inflow. ai] 


Comment. The divergence theorem of Gauss, an integral theorem involving the 
divergence, follows in the next chapter (Sec. 10.7). 


PROBLEM SET 978 


1-6 


CALCULATION OF THE DIVERGENCE 


Find div v and its value at P. 

v = [x?, 4y?, 927], P: (-1,0,4] 

v = [0, cos xyz, sin xyz], P: (2, dn, 0] 
v=(x2 + yh, y] everywhere, (b) div v > Oif |z| < 1 and div v < Oif 
V = [0109.2 Volz, x), 030%), PB, 1, DI Ie] > 1. 


Jv = xyz, y, 2] P: (3, —-1,4) 


Lv = (2 + yy? + 24) 3Iy y, 2] 
. For what v3 is v = [e” cos y, e” sin y, v3] solenoidal? 
. Let v = [x, y, v3]. Find avg such that (a) divv > 0 
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PROJECT. Useful Formulas for the Divergence. 
Prove 


(a) div (kv) = kdivv_ (k constant) 

(b) div (fv) =fdivv + ve Vf 

(c) div (fVg) =fVg + Vfe Ve 

(d) div (fVg) — div (gVf) =fV7g — gV°f 

Verify (b) for f= e* and v = axi + byj + cczk. 
Obtain the answer to Prob. 6 from (b). Verify (c) for 


f=x?—y? and g = e**Y. Give examples of your 


own for which (a)-(d) are advantageous. 

CAS EXPERIMENT. Visualizing the Divergence. 
Graph the given velocity field v of a fluid flow in a 
square centered at the origin with sides parallel to the 
coordinate axes. Recall that the divergence measures 
outflow minus inflow. By looking at the flow near the 
sides of the square, can you see whether div v must 
be positive or negative or may perhaps be zero? Then 
calculate div v. First do the given flows and then do 
some of your own. Enjoy it. 


(a) v=i 

(b) v = xi 

(c) v=xi — yj 

(d) v=xi + yj 

(e) v= —xi — yj 


() v= (x? + y*)"(-yi + xp) 
Incompressible flow. Show that the flow with velocity 
vector v = yiis incompressible. Show that the particles 


12. 


14. 


that at time t = O are in the cube whose faces are 
portions of the planes x =0,x=1,y=0,y=1, 
z = 0,z = 1 occupy at t = 1 the volume 1. 


Compressible flow. Consider the flow with velocity 
vector v = xi. Show that the individual particles have 
the position vectors r(f) = cye‘i + coj + c3k with 
constant cy, Co, c3. Show that the particles that at 
t = O are in the cube of Prob. 11 at tf = 1 occupy the 
volume e. 


. Rotational flow. The velocity vector v(x, y, z) of an 


incompressible fluid rotating in a cylindrical vessel is 
of the form v = w Xr, where w is the (constant) 
rotation vector; see Example 5 in Sec. 9.3. Show that 
div v = 0. Is this plausible because of our present 
Example 2? 

Does divu=divv imply u=v or u=v+k 
(k constant)? Give reason. 


15-20 


LAPLACIAN 


Calculate Vf by Eq. (3). Check by direct differentiation. 
Indicate when (3) is simpler. Show the details of your work. 


15. f = cos” x + sin? y 
16. f =e 

17. f = In(x? + y?) 

18. f=z—-— Vx? +¥? 
19. f = 1/(x? + y? + 2?) 
20. f = e?” cosh 2y 


99 Curl of a Vector Field 


The concepts of gradient (Sec. 9.7), divergence (Sec. 9.8), and curl are of fundamental 
importance in vector calculus and frequently applied in vector fields. In this section 
we define and discuss the concept of the curl and apply it to several engineering 


problems. 


Let v(x, y, Z) = [01, Ve, U3] = Uzi + Vej + U3k be a differentiable vector function of 
the Cartesian coordinates x, y, z. Then the curl of the vector function v or of the vector 
field given by v is defined by the “symbolic” determinant 


ij 
) ) ) 
(1) culv=VXv=|— — — 
Ox oy oz 
U1 v9 U3 
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EXAMPLE 1 


EXAMPLE 2 


THEOREM 1 


This is the formula when x, y, z are right-handed. If they are left-handed, the determinant 
has a minus sign in front (just as in (2**) in Sec. 9.3). 

Instead of curl v one also uses the notation rot v. This is suggested by “rotation,” 
an application explored in Example 2. Note that curl v is a vector, as shown in 
Theorem 3. 


Curl of a Vector Function 


Let v = [yz, 3zx, z] = yzi + 3zxj + zk with right-handed x, y, z. Then (1) gives 


i j k 
a a a Pree ie fat 
curl v . 3xi + yj + 3z — 2k 3xi + yj + 2zk. ia 
ox oy Oz 
YZ 3zx ra 


The curl has many applications. A typical example follows. More about the nature and 
significance of the curl will be considered in Sec. 10.9. 


Rotation of a Rigid Body. Relation to the Curl 


We have seen in Example 5, Sec. 9.3, that a rotation of a rigid body B about a fixed axis in space can be 
described by a vector w of magnitude w in the direction of the axis of rotation, where w (>0) is the angular 
speed of the rotation, and w is directed so that the rotation appears clockwise if we look in the direction of w. 
According to (9), Sec. 9.3, the velocity field of the rotation can be represented in the form 


v=wxr 
where r is the position vector of a moving point with respect to a Cartesian coordinate system having the origin 


on the axis of rotation. Let us choose right-handed Cartesian coordinates such that the axis of rotation is the 
z-axis. Then (see Example 2 in Sec. 9.4) 


w=([0, 0, w] = ak, v=wxr=[-oy, ox, 0] = —oyi + wxj. 
Hence 
i j k 
0 to) tc) 
curl v = (0, 0, 2] = 20k = 2w. 
Ox oy 0z 
—wy wx 0 
This proves the following theorem. | 


Rotating Body and Curl 


The curl of the velocity field of a rotating rigid body has the direction of 
the axis of the rotation, and its magnitude equals twice the angular speed of the 
rotation. 


Next we show how the grad, div, and curl are interrelated, thereby shedding further light 
on the nature of the curl. 
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THEOREM 2 Grad, Div, Curl 


Gradient fields are irrotational. That is, if a continuously differentiable vector 
function is the gradient of a scalar function f, then its curl is the zero vector, 


(2) curl(grad f) = 0. 


Furthermore, the divergence of the curl of a twice continuously differentiable vector 
function V is zero, 


(3) div (curl v) = 0. 


PROOF Both (2) and (3) follow directly from the definitions by straightforward calculation. In the 
proof of (3) the six terms cancel in pairs. B 


EXAMPLE 3 Rotational and Irrotational Fields 


The field in Example 2 is not irrotational. A similar velocity field is obtained by stirring tea or coffee in a cup. 
The gravitational field in Theorem 3 of Sec. 9.7 has curl p = 90. It is an irrotational gradient field. ia) 


The term “irrotational” for curl v = 0 is suggested by the use of the curl for characterizing 
the rotation in a field. If a gradient field occurs elsewhere, not as a velocity field, it is 
usually called conservative (see Sec. 9.7). Relation (3) is plausible because of the 
interpretation of the curl as a rotation and of the divergence as a flux (see Example 2 in 
Sec. 9.8). 

Finally, since the curl is defined in terms of coordinates, we should do what we did for 
the gradient in Sec. 9.7, namely, to find out whether the curl is a vector. This is true, as 
follows. 


THEOREM 3 Invariance of the Curl 


curl v is a vector. It has a length and a direction that are independent of the particular 
choice of a Cartesian coordinate system in space. 


PROOF The proof is quite involved and shown in App. 4. 
We have completed our discussion of vector differential calculus. The companion 
Chap. 10 on vector integral calculus follows and makes use of many concepts covered 
in this chapter, including dot and cross products, parametric representation of curves C, 
along with grad, div, and curl. 


PROBLEM SET 9-9 


1. WRITING REPORT. Grad, div, curl. List the 2. (a) What direction does curl v have if v is parallel 


definitions and most important facts and formulas for to the yz-plane? (b) If, moreover, v is independent 
grad, div, curl, and V2. Use your list to write a of x? 
corresponding report of 3-4 pages, with examples of 3. Prove Theorem 2. Give two examples for (2) and (3) 


your own. No proofs. each. 


Chapter 9 Review Questions and Problems 


4-8| CALCULUTION OF CURL 


Find curl v for v given with respect to right-handed 
Cartesian coordinates. Show the details of your work. 


4. v = [2y, 5x, 0] 


5. v = xyzI[x, y, z] 

6. v= (x? + y? + 2) 3/2 [x, y, Z] 
7. v = [0,0,e * sin y] 

8 v= [e*, er. eH} 


FLUID FLOW 


Let v be the velocity vector of a steady fluid flow. Is the 
flow irrotational? Incompressible? Find the streamlines (the 
paths of the particles). Hint. See the answers to Probs. 9 
and 11 for a determination of a path. 

9. v = [0, 32”, 0] 
10. v = [sec x, csc x, 0] 
11. v = [y, —2x, 0] 
12. v = [—y, x, 77] 
13. v = [x, y, -z] 


1. What is a vector? A vector function? A vector field? A 
scalar? A scalar function? A scalar field? Give examples. 

2. What is an inner product, a vector product, a scalar triple 
product? What applications motivate these products? 

3. What are right-handed and left-handed coordinates? 
When is this distinction important? 

4. When is a vector product the zero vector? What is 
orthogonality? 

5. How is the derivative of a vector function defined? 
What is its significance in geometry and mechanics? 

6. If r(t) represents a motion, what are r’(t), |r’ (|, vr" (1), 
and |r” (|? 

7. Can a moving body have constant speed but variable 
velocity? Nonzero acceleration? 

8. What do you know about directional derivatives? Their 
relation to the gradient? 

9. Write down the definitions and explain the significance 
of grad, div, and curl. 

10. Granted sufficient differentiability, which of the 
following expressions make sense? f curl v, v curl fi 
uXv, uXvXxXw, fev, fe(v X Ww), use(Vv X Ww), 
v X curl v, div (fv), curl (fv), and curl (f* v). 


11-19 | ALGEBRAIC OPERATIONS FOR VECTORS 


Let a = [4, 7, 0], b = [3, —1, 5], e = [—6, 2, 0], and d = 
[1, —2, 8]. Calculate the following expressions. Try to 
make a sketch. 


3b°8d, 24d*b, aca 


11. asc, 
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14. PROJECT. Useful Formulas for the Curl. Assuming 
sufficient differentiability, show that 


(a) curl (u + v) = curlu + curl v 

(b) div (curl v) = 0 

(c) curl (fv) = (grad f) X v + fcurl v 
(d) curl (grad f) = 0 

(e) div(u X v) = vecurlu — uecurlyv 


15-20} DIV AND CURL 


With respect to right-handed coordinates, let u = [y, z, x], 
v = [yz, zx, xy], f = xyz,andg = x + y + z. Find the given 
expressions. Check your result by a formula in Proj. 14 
if applicable. 


15. curl (u + v), curl v 

16. curl (gv) 

17. ve curlu, ue curl v, ue curlu 
18. div (u X v) 

19. curl (gu + vy), curl (gu) 

20. div (grad ( fg)) 


CHAPTER-9 REVIEW QUESTIONS AND PROBLEMS 


12.axc, bxd, dxXb, 
13.b xc, cXb, cXe, cre 

14. 5(axb)ec, ae(5bxXc), (5a be), S(arcb)xXe 
15. 6(a X b) Xd, aX 6(bXd), 2a x 3bXd 
16. (1/lal)a, (1/|bl)b, a*b/|bl, a*b/|al 

17. (ab d), (bad), (bd a) 

18. |a+ bl, [al + |b| 

19.axb—bXa, la x bl 


20. Commutativity. When is u X v = v X u? When is 
uev=veu? 


axa 


(aX c)°c¢, 


21. Resultant, equilibrium. Find u such that u and a, b, 
c, d above and w are in equilibrium. 

22. Resultant. Find the most general v such that the resultant 
of v, a, b, ¢ (see above) is parallel to the yz-plane. 

23. Angle. Find the angle between a and c. Between b and 
d. Sketch a and ce. 

24. Planes. Find the angle between the two planes 
Py: 4x — y + 3z = 12and Py: x + 2y + 4z = 4. Make 
a sketch. 

25. Work. Find the work done by q = [5, 2,0] in the 
displacement from (1, 1, 0) to (4, 3, 0). 

26. Component. When is the component of a vector v in 


the direction of a vector w equal to the component of 
w in the direction of v? 


27. Component. Find the component of v = [4, 7, 0] in 
the direction of w = [2, 2, 0]. Sketch it. 
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28. Moment. When is the moment of a force equal to zero? 32-40 | GRAD, DIV, CURL, V”, D\f 


29. Moment. A force p = [4, 2,0] is acting in a line Let f=xy—yz, v= [2y,2z,4x +z], and w= [327 
through (2, 3, 0). Find its moment vector about the x2 — y?, y?]. Find: 


eat US SOE BEE 32. grad f and f grad f at P: (2, 7, 0) 


30. Velocity, acceleration. Find the velocity, speed, 43. div. divw 44 Gu, Ge 


and acceleration of the motion given by r(f) = [3 . 5 5 
cos t, 3 sint,4/] (t= time) at the point P: 3/V2, 3 div(gradf), Vi V'Oxf) 


3/V2, 17). 36. (curl w) « v at (4, 0, 2) 37. grad (div w) 
31. Tetrahedron. Find the volume if the vertices are 38. Dyf at P: C1, 1, 2) 39. Dy f at P: (3, 0, 2) 
(0, 0, 0), (3, 1, 2), (2, 4, 0), (5, 4, 0). 40. v + ((curl w) X v) 


SUMMARY-OF-CHAPTER-9 


Vector Differential Calculus. Grad, Div, Curl 


All vectors of the form a = [ay, dg, a3] = ai + aaj + ag3k constitute the real 
vector space R® with componentwise vector addition 


(1) [ay, dg, a3] + [by, be, bg] = [ay + by, ag + be, ag + bg] 
and componentwise scalar multiplication (c a scalar, a real number) 
(2) clay, do, dg] = [cay, cdg, cas] (Sec. 9.1). 


For instance, the resultant of forces a and b is the sum a + b. 
The inner product or dot product of two vectors is defined by 


(3) a*b = |al|b| cos y = a,b, + agbo + agbs (Sec. 9.2) 
where y is the angle between a and b. This gives for the norm or length |a| of a 


(4) lal = Vaea= Vai + a3 + a3 


as well as a formula for y. If a * b = 0, we call a and b orthogonal. The dot product 
is suggested by the work W = ped done by a force p in a displacement d. 
The vector product or cross product v = a X b is a vector of length 


(5) la X b| = |al|b| sin y (Sec. 9.3) 


and perpendicular to both a and b such that a, b, v form a right-handed triple. In 
terms of components with respect to right-handed coordinates, 


ijk 


(6) aXb=l|q ae a3 (Sec. 9.3). 


Summary of Chapter 9 
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The vector product is suggested, for instance, by moments of forces or by rotations. 
CAUTION! This multiplication is anticommutative, a X b = —b X a, and is not 
associative. 

An (oblique) box with edges a, b, ¢ has volume equal to the absolute value of 
the scalar triple product 


(7) (a b c)=ar(bXec)=(axb)ec. 
Sections 9.49.9 extend differential calculus to vector functions 


v(t) = [v1 (2), vo(t), v3()] = vii + ve(Dj + v3(k 


and to vector functions of more than one variable (see below). The derivative of 
v(t) is 


v(t + At) — v(t) 
At 


° , , y, Ie De , 
lim = [V1, Vg, U3] = vy + Voj + UZ3k. 
At>0 


Differentiation rules are as in calculus. They imply (Sec. 9.4) 


(uev)’ =u'evtuev, (ux v) =u xvtuxv’. 

Curves C in space represented by the position vector r(t) have r’(t) as a tangent 
vector (the velocity in mechanics when f¢ is time), r’(s) (s are length, Sec. 9.5) as 
the unit tangent vector, and \r”(s)| = « as the curvature (the acceleration in 
mechanics). 

Vector functions v(x, y, z) = [U1(x, y, Z), V2(%, y, Z), Ug (% y, Z)] represent vector 
fields in space. Partial derivatives with respect to the Cartesian coordinates x, y, z 
are obtained componentwise, for instance, 


F) OU, OUVg dU OU OU OU 
y | le i je kee. 9.8), 
ox Ox Ox Ox Ox Ox Ox 
The gradient of a scalar function / is 
of of of 
(9) grad f = Vf = roar (Sec. 9.7). 
ax dy dz 
The directional derivative of f in the direction of a vector a is 
df 
(10) Daf =— =—aeVf (Sec. 9.7). 
ds lal 
The divergence of a vector function v is 
vy 0V2 v3 
(11) divv =Vev= + + (Sec. 9.8). 
Ox oy Oz 
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The curl of v is 


i j k 
0 ) ts) 

lv=VXv= Sec. 9.9 

(12) curl v v oe ay = (Sec ) 
Uy v2 U3 


or minus the determinant if the coordinates are left-handed. 
Some basic formulas for grad, div, curl are (Secs. 9.7—9.9) 


V( fg) =fVg + gVf 


(13) 
V(f/g) = C/s*\(eVf — fVe) 
div (fv) = fdivv + ve Vf 
(14) 
div (fVg) = fV’g + Vf* Vg 
Vf = div (V 
(15) ey) 
V'( fa) = aV'f + 2Vf+ Ve + V's 
curl (fv) = Vf X v + fcurlv 
(16) 
div (u X v) = vecurlu — uecurlv 
curl (Vf) = 0 
(17) 


div (curl v) = 0. 


For grad, div, curl, and V? in curvilinear coordinates see App. A3.4. 


ep ep O 


Vector Integral Calculus. 
Integral Theorems 


Vector integral calculus can be seen as a generalization of regular integral calculus. You 
may wish to review integration. (To refresh your memory, there is an optional review 
section on double integrals; see Sec. 10.3.) 

Indeed, vector integral calculus extends integrals as known from regular calculus to 
integrals over curves, called line integrals (Secs. 10.1, 10.2), surfaces, called surface integrals 
(Sec. 10.6), and solids, called triple integrals (Sec. 10.7). The beauty of vector integral 
calculus is that we can transform these different integrals into one another. You do this 
to simplify evaluations, that is, one type of integral might be easier to solve than another, 
such as in potential theory (Sec. 10.8). More specifically, Green’s theorem in the plane 
allows you to transform line integrals into double integrals, or conversely, double integrals 
into line integrals, as shown in Sec. 10.4. Gauss’s convergence theorem (Sec. 10.7) converts 
surface integrals into triple integrals, and vice-versa, and Stokes’s theorem deals with 
converting line integrals into surface integrals, and vice-versa. 

This chapter is a companion to Chapter 9 on vector differential calculus. From Chapter 9, 
you will need to know inner product, curl, and divergence and how to parameterize curves. 
The root of the transformation of the integrals was largely physical intuition. Since the 
corresponding formulas involve the divergence and the curl, the study of this material will 
lead to a deeper physical understanding of these two operations. 

Vector integral calculus is very important to the engineer and physicist and has many 
applications in solid mechanics, in fluid flow, in heat problems, and others. 


Prerequisite: Elementary integral calculus, Secs. 9.7—9.9 
Sections that may be omitted in a shorter course: 10.3, 10.5, 10.8 
References and Answers to Problems: App. | Part B, App. 2 


10.1 Line Integrals 


The concept of a line integral is a simple and natural generalization of a definite integral 


b 


d) | f(x) dx. 


a 


Recall that, in (1), we integrate the function f(x), also known as the integrand, from x = a 
along the x-axis to x = b. Now, ina line integral, we shall integrate a given function, also 
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B 
Ao 


(a) (b) 
Fig. 219. Oriented curve 


called the integrand, along a curve C in space or in the plane. (Hence curve integral 
would be a better name but line integral is standard). 
This requires that we represent the curve C by a parametric representation (as in Sec. 9.5) 


(2) ri) = 2, yO, 2) = xOit VYOJ+2Ok = (stsd). 


The curve C is called the path of integration. Look at Fig. 219a. The path of integration 
goes from A to B. Thus A: r(q) is its initial point and B: r(b) is its terminal point. C is 
now oriented. The direction from A to B, in which ¢ increases is called the positive 
direction on C. We mark it by an arrow. The points A and B may coincide, as it happens 
in Fig. 219b. Then C is called a closed path. 

C is called a smooth curve if it has at each point a unique tangent whose direction varies 
continuously as we move along C. We note that r(#) in (2) is differentiable. Its derivative 
r’(t) = dr/dt is continuous and different from the zero vector at every point of C. 


General Assumption 


In this book, every path of integration of a line integral is assumed to be piecewise smooth, 
that is, it consists of finitely many smooth curves. 


For example, the boundary curve of a square is piecewise smooth. It consists of four 
smooth curves or, in this case, line segments which are the four sides of the square. 


Definition and Evaluation of Line Integrals 


A line integral of a vector function F(r) over a curve C: r(f) is defined by 


de 


b 
(3) | Fe) edr = | F(r(t)) ¢ r'() dt r= 


Cc a 


where r(f) is the parametric representation of C as given in (2). (The dot product was defined 
in Sec. 9.2.) Writing (3) in terms of components, with dr = [dx, dy, dz] as in Sec. 9.5 
and ' = d/dt, we get 


| Fe -ar = | rae + Peay + Pde 
(3') Cc Cc 
b 
= | (Fy x’ sr Foy’ oF F3z') dt. 


a 
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If the path of integration C in (3) is a closed curve, then instead of 


| we also write ; ; 


Cc Cc 


Note that the integrand in (3) is a scalar, not a vector, because we take the dot product. Indeed, 
Fer’ y \r’| is the tangential component of F. (For “component” see (11) in Sec. 9.2.) 

We see that the integral in (3) on the right is a definite integral of a function of ¢ taken 
over the interval a =t=b on the f-axis in the positive direction: The direction of 
increasing t. This definite integral exists for continuous F and piecewise smooth C, because 
this makes F * r’ piecewise continuous. 

Line integrals (3) arise naturally in mechanics, where they give the work done by a 
force F in a displacement along C. This will be explained in detail below. We may thus 
call the line integral (3) the work integral. Other forms of the line integral will be discussed 
later in this section. 


EXAMPLE 1_ Evaluation of a Line Integral in the Plane 


Find the value of the line integral (3) when F(r) = [—y, —xy] yi — xyj and C is the circular arc in Fig. 220 
from A to B. 


Solution. We may represent C by r(t) = [cost,sin¢] = costi+ sintj, where 0 StS 7/2. Then 
x(t) = cost, y(t) = sint, and 


y F r(a)) yt)i — xOy(j = [—sint, cos ¢t sin tf] = —sin ti — cos tsintj. 
B 
By differentiation, r’(t) = [—sint, cost] = —sin ti + cos rj, so that by (3) [use (10) in App. 3.1; set cos t = u 
Cc in the second term] 
17/2 m/2 
| Fear = | [-sin t, —cos ¢ sin ft] * [—sin ¢, cos t] dt = | (sin? ¢ — cos” t sin f) dt 
4 Cc 0 0 
1 « ss ve 7 I 
7 —=(1 — cos 21) dt u*(—du) 0 ~ 0.4521. | 
Fig. 220. Example 1 o 2 1 3 3 


EXAMPLE 2 Line Integral in Space 


The evaluation of line integrals in space is practically the same as it is in the plane. To see this, find the value 
of (3) when F(r) = [z, x, y] = zi + xj + yk and C is the helix (Fig. 221) 


(4) r(t) = [cos ¢, sin t, 3f] = cos tit sintj + 3tk (0St3S27). 


Solution. From (4) we have x(t) = cos t, y(t) = sin t, z(t) = 3t. Thus 


C Bi} F(r()) rf) = Gti + cos tj + sintk) + (—sinti + costj + 3k). 


The dot product is 34(—sin f) + cos” f + 3 sin ¢. Hence (3) gives 


1d 
Sia 
a Qn 
| Fee) ar | (—3t sint + cos?t + 3 sind) dt = 67 + 7 +0=77 ~ 21.99. & 
Fig. 221. Example 2 Cc 0 


Simple general properties of the line integral (3) follow directly from corresponding 
properties of the definite integral in calculus, namely, 


(5a) | kF ¢ dr = ‘| Fear (k constant) 
Cc Cc 
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Fig. 222. 
Formula (5c) 
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(5b) [a+ @rae= [rears [Gear 
(6 Cc Cc 
(5c) | Fear = | Fedr+ | Fedr (Fig. 222) 
Cc Cy Cy 


where in (5c) the path C is subdivided into two arcs C; and Cy that have the same 
orientation as C (Fig. 222). In (5b) the orientation of C is the same in all three integrals. 
If the sense of integration along C is reversed, the value of the integral is multiplied by — 1. 
However, we note the following independence if the sense is preserved. 


Direction-Preserving Parametric Transformations 


Any representations of C that give the same positive direction on C also yield the 
same value of the line integral (3). 


The proof follows by the chain rule. Let r(t) be the given representation with a St Sb 
as in (3). Consider the transformation t = (f*) which transforms the ¢ interval to 
a* = r* S b* and has a positive derivative dt/dt*. We write r(t) = r(@(t*)) = r*(t*). 
Then dt = (dt/dt*) dt* and 


b* 
| F(r*) ¢ dr* = | F(r(f(t*))) * oat 
Cc a* 


: d 
= | Fon) + Fat = | F(n) «dr. z 
a C 


Motivation of the Line Integral (3): 
Work Done by a Force 


The work W done by a constant force F in the displacement along a straight segment d 
is W = F «d; see Example 2 in Sec. 9.2. This suggests that we define the work W done 
by a variable force F in the displacement along a curve C: r(f) as the limit of sums of 
works done in displacements along small chords of C. We show that this definition amounts 
to defining W by the line integral (3). 

For this we choose points to (=a) < ty < +++ < t,(=b). Then the work AW,, done 
by F(r(¢,,)) in the straight displacement from r(f,,) to r(tm+1) is 


AWm = F(t(tm)) * m+.) — P(tm)) ~ FOCm)) * EG Abin (Atm = Atm+1 — tm). 


The sum of these 2 works is W, = AW) +:::+ AW,-}1. If we choose points and 
consider W, for every n arbitrarily but so that the greatest At,, approaches zero as 
n—, then the limit of W, as n—© is the line integral (3). This integral exists 
because of our general assumption that F is continuous and C is piecewise smooth; 
this makes r’(t) continuous, except at finitely many points where C may have corners 
or cusps. a 
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EXAMPLE-3 


EXAMPLE 4 


EXAMPLE 5 


Work Done by a Variable Force 


If F in Example | is a force, the work done by F in the displacement along the quarter-circle is 0.4521, measured 
in suitable units, say, newton-meters (nt - m, also called joules, abbreviation J; see also inside front cover). 
Similarly in Example 2. ia 


Work Done Equals the Gain in Kinetic Energy 


Let F be a force, so that (3) is work. Let f be time, so that dr/dt = v, velocity. Then we can write (3) as 


b 
(6) w= | Fedr= | F(r(2)) * v(t) dt. 


Cc a 


Now by Newton’s second law, that is, force = mass X acceleration, we get 
F = mr" (t) = mv'(t), 


where m is the mass of the body displaced. Substitution into (5) gives [see (11), Sec. 9.4] 


i > yey) m 
W= | my’ *vdt = | n( ) dt lv|? 
f p 2 2 


On the right, mlv|?/ 2 is the kinetic energy. Hence the work done equals the gain in kinetic energy. This is a 
basic law in mechanics. 


t=b 


t=a 


Other Forms of Line Integrals 


The line integrals 


(7) | F, dx, | Fo dy, | Fs dz 


Cc Cc Cc 


are special cases of (3) when F = Fy i or Fj or F3k, respectively. 
Furthermore, without taking a dot product as in (3) we can obtain a line integral whose 
value is a vector rather than a scalar, namely, 


b 


b 
(8) [Fe dt = | F(r(d) dt = | [Fia@), For), Fs] dt. 


Cc 


Obviously, a special case of (7) is obtained by taking Fy = f, Fo = F3 = 0. Then 


b 
(8*) | f(r) dt = | f(r(t)) dt 


Cc a 


with C as in (2). The evaluation is similar to that before. 


A Line Integral of the Form (8) 
Integrate F(r) = [xy, yz, z] along the helix in Example 2. 
Solution. F(r(t)) = [cos t sin t, 3¢ sin t, 34] integrated with respect to ¢ from 0 to 277 gives 


2a 
=[0, —67, 677). |_| 


2a 

1 2 : 39 
| F(r(t) dt = | 20s ft, 3sint— 3tcost, =—1 
6 2 2 0 
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Path Dependence 


Path dependence of line integrals is practically and theoretically so important that we 
formulate it as a theorem. And a whole section (Sec. 10.2) will be devoted to conditions 
under which path dependence does not occur. 


Path Dependence 


The line integral (3) generally depends not only on F and on the endpoints A and 
B of the path, but also on the path itself along which the integral is taken. 


Almost any example will show this. Take, for instance, the straight segment Cy: ry(f) = 
[t, t, 0] and the parabola Co: ro(t) = [f, 0] with O=t=1 (Fig. 223) and integrate 
F = (0, xy, 0]. Then F(r,() ° ri(d) = i F(ra(t)) ° r5(t) = ad so that integration gives 


1/3 and 2/5, respectively. | 
1 B 
C; 
C, 
. iL 
Fig. 223. Proof of Theorem 2 
PROBLEM SET T0-1 
1. WRITING PROJECT. From Definite Integrals to 8. F = [e",coshy, sinhz], C:r = [t, t, t?] from (0, 0, 0) 
Line Integrals. Write a short report (1-2 pages) with to (3, 4, 4). Sketch C. 
examples on line integrals as generalizations of definite _ 8 _ 
integrals. The latter give the area under a curve. Explain 9. F=[x+y,yt+z,z2+x], Cir = (21,5, t] from ¢ = 0 
: a : : to 1. Also from t = —1 to 1. 
the corresponding geometric interpretation of a line 
integral. 10. F = [x, —z, 2y] from (0, 0, 0) straight to (1, 1, 0), then 
to (1, 1, 1), back to (0, 0, 0) 
2-11) DINE INTEGRAL. WORK U1. F = [ee e7*],_ Cir = [4,1?, f] from (0,0, 0) to 
Calculate | F(r) ¢ dr for the given data. If F is a force, this 2,8 2) Skah C. 
Cc 12. PROJECT. Change of Parameter. Path Dependence. 


gives the work done by the force in the displacement along 


C. Show the details. 

2. F = [y?, -x”], C:y = 4x” from (0, 0) to (1, 4) 

3. FasinProb.2, C from (0, 0) straight to (1, 4). Compare. 

4. F = [xy,x7y"],_ C from (2, 0) straight to (0, 2) 

5. F as in Prob. 4, C the quarter-circle from (2, 0) to 
(0, 2) with center (0, 0) 

6. F=[x-y,y-—z,z2—-x], C:r = [2 cost,t,2 sin tf] 
from (2, 0, 0) to (2, 277, 0) 

7. F = [x7,y?, 27], Cir = [cos t, sin, e“] from (1, 0, 1) 


to (1, 0, e2”). Sketch C. 


Consider the integral | F(r) ¢ dr, where F = [xy, —y?). 
Cc 

(a) One path, several representations. Find the value 
of the integral when r = [cost, sin], 0 StS 7/2. 
Show that the value remains the same if you set t = —p 
ort = p? or apply two other parametric transformations 
of your own choice. 

(b) Several paths. Evaluate the integral when C: y = 
x”, thusr = [f,¢"],0 StS 1, wheren = 1, 2, 3,---. 
Note that these infinitely many paths have the same 
endpoints. 
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13. 


(c) Limit. What is the limit in (b) as n > ©? Can you 15-20 | INTEGRALS (8) AND (8*) 
confirm your result by direct integration without referring 
to (b)? 

(d) Show path dependence with a simple example of 
your choice involving two paths. 


Osts4r 


ML-Inequality, Estimation of Line Integrals. Let F 16. f= 3x+y+5z, C:r = [t,coshs, sinh 4], 


be a vector function defined on a curve C. Let |F| be 0 StS 1. Sketch C. 
bounded, say, |F| = MonC, where M is some positive 17 


Evaluate them with F or f and C as follows. 


15. F = [y?,27,x7], Cr = [3 cost, 3 sins, 24], 
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~F=[x+y,yt+zz2+x], C:r=[4cost, sin t, 0], 


number. Show that 


[Pea 


Cc 


(9) 


14. Using (9), find a bound for the absolute value of the 
work W done by the force F = [x”, y] in the dis- 


OStsT7 

18. F = [y¥3, x30], C the hypocycloid r = [cos® 1, 
sin? 7,0], OStS 7/4 

19. f=xyz, Cir = [4f,3¢7, 12t], -2 S182. 
Sketch C. 


= ML (L = Length of C). 


placement from (0, 0) straight to (3, 4). Integrate exactly 20. F = [xz, yz, x2y?], C:r = [t,t, e’], OsrssS. 


and compare. 


Sketch C. 


10.2 Path Independence of Line Integrals 


A 


Fig. 224. Path 
independence 


We want to find out under what conditions, in some domain, a line integral takes on the 
same value no matter what path of integration is taken (in that domain). As before we 
consider line integrals 


(1) | Fe) °dr = fe dx + Fo dy + Fx dz) (dr = [dx, dy, dz)) 
Ci @ 


The line integral (1) is said to be path independent in a domain D in space if for every 
pair of endpoints A, B in domain D, (1) has the same value for all paths in D that begin at 
A and end at B. This is illustrated in Fig. 224. (See Sec. 9.6 for ““domain.”) 

Path independence is important. For instance, in mechanics it may mean that we have 
to do the same amount of work regardless of the path to the mountaintop, be it short and 
steep or long and gentle. Or it may mean that in releasing an elastic spring we get back 
the work done in expanding it. Not all forces are of this type—think of swimming in a 
big round pool in which the water is rotating as in a whirlpool. 

We shall follow up with three ideas about path independence. We shall see that path 
independence of (1) in a domain D holds if and only if: 


(Theorem 1) F = grad f, where grad fis the gradient of f as explained in Sec. 9.7. 
(Theorem 2) Integration around closed curves C in D always gives 0. 
(Theorem 3) curl F = 0, provided D is simply connected, as defined below. 
Do you see that these theorems can help in understanding the examples and counterexample 
just mentioned? 


Let us begin our discussion with the following very practical criterion for path 
independence. 
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Path Independence 


A line integral (1) with continuous Fy, Fo, F3 in a domain D in space is path 
independent in D if and only if F = [Fy, Fo, F3] is the gradient of some function 
fin D, 


(2) F=gradf, thus, Fy = — 


(a) We assume that (2) holds for some function f in D and show that this implies path 
independence. Let C be any path in D from any point A to any point B in D, given by 
r(f) = [x(), yO), ~=z()], where a S t S b. Then from (2), the chain rule in Sec. 9.6, and 
(3’) in the last section we obtain 


af af 
(Fy dx + Fo dy + F3dz) = a+ ay + 3 dz 
ie z 


fee, dx of dy . af a 
aa Oe ae Be ae 


a 


t=b 


_[e, 
= | Ee feo, y(t), 2(0)] 


t=a 


= fa), yb), Ab) — fA@, Wa, aa) 
= f(B) — fl). 


(b) The more complicated proof of the converse, that path independence implies (2) 
for some f, is given in App. 4. a 


The last formula in part (a) of the proof, 


B 
(3) | (Fy dx + Fo dy + F3 dz) = f(B) — f(A) [F = grad f'] 
A 


is the analog of the usual formula for definite integrals in calculus, 


b 
= G(b) — G(a) [G’(x) = g@)]. 


b 
| g(x) dx = G(x) 


a 


a 


Formula (3) should be applied whenever a line integral is independent of path. 


Potential theory relates to our present discussion if we remember from Sec. 9.7 that when 
F = grad f, then f is called a potential of F. Thus the integral (1) is independent of path 
in D if and only if F is the gradient of a potential in D. 
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EXAMPLE=1 


EXAMPLE 2 


THEOREM 2 


C, 


Fig. 225. Proof of 
Theorem 2 


Path Independence 


Show that the integral | Fedr = | (2x dx + 2y dy + 4z dz) is path independent in any domain in space and 
c c 
find its value in the integration from A: (0, 0, 0) to B: (2, 2, 2). 


Solution. ¥ = [2x, 2y, 4z] = grad f, where f = x2 + y? + 2c? because df/dx = 2x = Fy, df/dy = 2y = Fo, 
of/dz = 4z = F3. Hence the integral is independent of path according to Theorem 1, and (3) gives 
F(B) — f(A) = f, 2, 2) — f(0, 0,0) =4+4+4+ 8 = 16. 
If you want to check this, use the most convenient path C:r(f)=[t, 4 ¢],0StS2, on which 
Fr(t) = [2t, 2t, 4], so that F(r(a) « r’(t) = 2t + 2t + 4t = 81, and integration from 0 to 2 gives 8 - 27/2 = 16. 
If you did not see the potential by inspection, use the method in the next example. a 


Path Independence. Determination of a Potential 


Evaluate the integral J = | (3x2 dx + 2yz dy + y? dz) from A: (0, 1, 2) to B: (1, —1, 7) by showing that F has a 
c 
potential and applying (3). 


Solution. If F has a potential f, we should have 
fe = Fy, = 3x”, fy = Fa = 29, fp = Fg =”. 
We show that we can satisfy these conditions. By integration of f,, and differentiation, 


f=x? + gly, 2), fy = 8y = 2yz, gp =yzth®, fax t+y'2+ Ae 


= y? +h’ = y?, h' =0 h=0,_ say. 


This gives f(x,y,z) = x° + y*z and by (3), 


1=f(, -1,7) — f, 1,2) =1+7-0+2)=6. a 


Path Independence and Integration 
Around Closed Curves 


The simple idea is that two paths with common endpoints (Fig. 225) make up a single 
closed curve. This gives almost immediately 


Path Independence 


The integral (1) is path independent in a domain D if and only if its value around 
every closed path in D is zero. 


If we have path independence, then integration from A to B along Cy and along C2 in 
Fig. 225 gives the same value. Now C and C2 together make up a closed curve C, and 
if we integrate from A along C, to B as before, but then in the opposite sense along Cz 
back to A (so that this second integral is multiplied by —1), the sum of the two integrals 
is zero, but this is the integral around the closed curve C. 

Conversely, assume that the integral around any closed path C in D is zero. Given any 
points A and B and any two curves Cy and C2 from A to B in D, we see that C; with the 
orientation reversed and Cz together form a closed path C. By assumption, the integral 
over C is zero. Hence the integrals over C, and Co, both taken from A to B, must be equal. 
This proves the theorem. a 
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Work. Conservative and Nonconservative (Dissipative) Physical Systems 


Recall from the last section that in mechanics, the integral (1) gives the work done by a 
force F in the displacement of a body along the curve C. Then Theorem 2 states that work 
is path independent in D if and only if its value is zero for displacement around every 
closed path in D. Furthermore, Theorem | tells us that this happens if and only if F is the 
gradient of a potential in D. In this case, F and the vector field defined by F are called 
conservative in D because in this case mechanical energy is conserved; that is, no work 
is done in the displacement from a point A and back to A. Similarly for the displacement 
of an electrical charge (an electron, for instance) in a conservative electrostatic field. 

Physically, the kinetic energy of a body can be interpreted as the ability of the body to 
do work by virtue of its motion, and if the body moves in a conservative field of force, 
after the completion of a round trip the body will return to its initial position with the 
same kinetic energy it had originally. For instance, the gravitational force is conservative; 
if we throw a ball vertically up, it will (if we assume air resistance to be negligible) return 
to our hand with the same kinetic energy it had when it left our hand. 

Friction, air resistance, and water resistance always act against the direction of motion. 
They tend to diminish the total mechanical energy of a system, usually converting it into 
heat or mechanical energy of the surrounding medium (possibly both). Furthermore, 
if during the motion of a body, these forces become so large that they can no longer 
be neglected, then the resultant force F of the forces acting on the body is no longer 
conservative. This leads to the following terms. A physical system is called conservative 
if all the forces acting in it are conservative. If this does not hold, then the physical system 
is called nonconservative or dissipative. 


Path Independence and Exactness 
of Differential Forms 


Theorem | relates path independence of the line integral (1) to the gradient and Theorem 2 
to integration around closed curves. A third idea (leading to Theorems 3* and 3, below) 
relates path independence to the exactness of the differential form or Pfaffian form' 


(4) Fe dr = Fi dx + Fody + Fadz 


under the integral sign in (1). This form (4) is called exact in a domain D in space if it 
is the differential 


of of of 
df = — dx + —dy + — dz = (gradf)* dr 
Ox oy Oz 


of a differentiable function f(x, y, z) everywhere in D, that is, if we have 
Fe dr = df. 


Comparing these two formulas, we see that the form (4) is exact if and only if there is a 
differentiable function f (x, y, z) in D such that everywhere in D, 


_f , f a 


5 F= df, thus, Fy = ; = - = : 
(5) grad f, us l= ae 2 = oy ao 


1JOHANN FRIEDRICH PFAFF (1765-1825). German mathematician. 
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THEOREM 3* 


THEOREM 3 


PROOF 


Hence Theorem | implies 


Path Independence 


The integral (1) is path independent in a domain D in space if and only if the differential 
form (4) has continuous coefficient functions Fy, F, F3 and is exact in D. 


This theorem is of practical importance because it leads to a useful exactness criterion. 
First we need the following concept, which is of general interest. 

A domain D is called simply connected if every closed curve in D can be continuously 
shrunk to any point in D without leaving D. 

For example, the interior of a sphere or a cube, the interior of a sphere with finitely many 
points removed, and the domain between two concentric spheres are simply connected. On 
the other hand, the interior of a torus, which is a doughnut as shown in Fig. 249 in Sec. 10.6 
is not simply connected. Neither is the interior of a cube with one space diagonal removed. 

The criterion for exactness (and path independence by Theorem 3%) is now as follows. 


Criterion for Exactness and Path Independence 


Let F, Fs, F3 in the line integral (1), 


| Fe) ‘dr = ie dx + Fo dy + Fx3 dz), 
Cc Cc 
be continuous and have continuous first partial derivatives in a domain D in space. Then: 


(a) If the differential form (A) is exact in D—and thus (1) is path independent 
by Theorem 3*—, then in D, 


(6) curl F = 0; 


in components (see Sec. 9.9) 


“7 OF3 OF 5 OF, OF3 OF > OFy 
(@) dy az’ az ax” ax ay | 

(b) Jf (6) holds in D and D is simply connected, then (4) is exact in D—and 
thus (1) is path independent by Theorem 3*. 


(a) If (4) is exact in D, then F = gradf in D by Theorem 3*, and, furthermore, 
curl F = curl (grad f) = 0 by (2) in Sec. 9.9, so that (6) holds. 
(b) The proof needs “Stokes’s theorem” and will be given in Sec. 10.9. o 


Line Integral in the Plane. For | F(r) ¢ dr = | (Fy dx + F2 dy) the curl has only one 
C C 
component (the z-component), so that (6’) reduces to the single relation 


OFg OF, 
(6 i) as Se eens 
Ox oy 


(which also occurs in (5) of Sec. 1.4 on exact ODEs). 
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EXAMPLE 3. Exactness and Independence of Path. Determination of a Potential 
Using (6’), show that the differential form under the integral sign of 


l= | [2xyz? dx + (x22? + z cos yz) dy + (2x2yz + y cos yz) dz] 
c 


is exact, so that we have independence of path in any domain, and find the value of J from A: (0, 0, 1) to 
B: (1, 77/4, 2). 


Solution. Exactness follows from (6’), which gives 
g 


(F3)y 2x22 + cos yz — yz sin yz = (F9)- 
(Fz = Axyz = (F3) 


(Fo)y = 2x2” = (Fry. 


To find f, we integrate Fz (which is “long,” so that we save work) and then differentiate to compare with F, and F3, 


f |r dy [ore + z cos yz) dy = xezty + sin yz + g(x, z) 


fo = 2xz7y + ge = Fy = 2xye*, ge = 0, g = he) 


fe = 2x?zy + ycos yz t+ h! = Fy = 2x?zy + y cos yz, h' =0. 


h' = 0 implies h = const and we can take h = 0, so that g = 0 in the first line. This gives, by (3), 


T T 
f(y. 2) =x7yz2 + sinyz, f(B) — f(A) = 1+ ; -4+ sin ; O=7+1. aa] 


The assumption in Theorem 3 that D is simply connected is essential and cannot be omitted. 
Perhaps the simplest example to see this is the following. 


EXAMPLE 4 _ Onthe Assumption of Simple Connectedness in Theorem 3 


Let 


y x 
(7) Fy =- : Fo = ’ F3 = 0. 
x2 4: y? x2 4s y? 


Differentiation shows that (6’) is satisfied in any domain of the xy-plane not containing the origin, for example, 


in the domain D: 3 <Vx74+ y? < 3 shown in Fig. 226. Indeed, Fy and Fz do not depend on z, and F3 = 0, 
so that the first two relations in (6’) are trivially true, and the third is verified by differentiation: 


OF 2 xe + y —x-2x y? — x? 

ax (x2 + ye (2 + yy’ 
OF x? + y2 — y+ 2y y? — x? 
ay (x2 + yy)? e? + py)?” 


Clearly, D in Fig. 226 is not simply connected. If the integral 


—ydx + xdy 


l= | (Fy dx + Fa) = | 5 5 
Cc c x ty 


were independent of path in D, then J = 0 on any closed curve in D, for example, on the circle xe + y = 1. 
But setting x = rcos 6, y = r sin @ and noting that the circle is represented by r = 1, we have 


x = cos 6, dx = —sin 6 dé, y = sin 6, dy = cos 0 dé, 
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so that —y dx + x dy = sin? 6 d@ + cos” @.d0 = dé and counterclockwise integration gives 


20 
dé 
r= | — =27. 
oC 


Since D is not simply connected, we cannot apply Theorem 3 and cannot conclude that / is independent of path 


in D. 
Although F = grad f, where f = arctan (y/x) (verify!), we cannot apply Theorem 1 either because the polar 
angle f = 6 = arctan (y/x) is not single-valued, as it is required for a function in calculus. a 


rolw 


Fig. 226. Example 4 


PROBLEM SET 10-2 


1. WRITING PROJECT. Report on Path Independence. G, 7,3) 
Make a list of the main ideas and facts on path 8. | (cos yz dx — xz sin yz dy — xy sin yz dz) 
independence and dependence in this section. Then 6, 3,7) 


work this list into a report. Explain the definitions and 
the practical usefulness of the theorems, with illustrative 
examples of your own. No proofs. 


0, D 
9. | (e” cosh y dx + (e” sinh y + e*cosh y) dy 
0, 1,0) 
+ e* sinh y dz) 
10. PROJECT. Path Dependence. (a) Show _ that 


2. On Example 4. Does the situation in Example 4 of the 


text change if you take the domain 0 < x2 + y? < 


2? 
a l= | (xy dx + 2xy” dy) is path dependent in the 


3-9| PATH INDEPENDENT INTEGRALS eC 


xy-plane. 


Show that the form under the integral sign is exact in the 
plane (Probs. 3-4) or in space (Probs. 5—9) and evaluate the 
integral. Show the details of your work. 


(b) Integrate from (0, 0) along the straight-line 
segment to (1, b), 0 = b S 1, and then vertically up to 
(1, 1); see the figure. For which b is J maximum? What 


(7, 0) 
3. | G cos ax cos 2y dx — 2 sin Hx sin 2y dy) is its maximum value? 
(11/2, 7) (c) Integrate / from (0, 0) along the straight-line segment 
(6, D to (c, 1), 0 S c S 1, and then horizontally to (1, 1). For 
4. | eM (2x dx + 4x? dy) c = 1, do you get the same value as for b = 1 in (b)? 
(4, 0) For which c is / maximum? What is its maximum value? 
(2, 1/2, 7/2) y 
5. | ey sin z dx + x sin zdy + cos z dz) 1 


(0, 0, 77) 


d, 1,0) 
ae 2 2 
6. | ev TY" x dx + ydy + zdz) 
(0,0, 0) 


a1, D 
7. | (yz sinh xz dx + cosh xz dy + xy sinh xz dz) ye) 


(0, 2, 3) Project 10. Path Dependence 
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11. 


12. 
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On Example 4. Show that in Example 4 of the text, 
F = grad (arctan (y/x)). Give examples of domains in 
which the integral is path independent. 


CAS EXPERIMENT. Extension of Project 10. Inte- 
grate xy dx + 2xy" dy over various circles through the 
points (0, 0) and (1, 1). Find experimentally the smallest 
value of the integral and the approximate location of 
the center of the circle. 


13-19; PATH INDEPENDENCE? 


14. 
15. 
16. 
17. 
18. 
19. 


(sinh xy) (z dx — x dz) 

xy dx — Axy? dy + 82x dz 

e4 dx + (xe¥ — e*) dy — ye* dz 

4y dx + zdy + (y — 22) daz 

(cos xy)(yz dx + xz dy) — 2 sin xy dz 

(cos (x? + 2y? + z2)) (2x dx + 4y dy + 2z dz) 


Check, and if independent, integrate from (0, 0, 0) to (a, b, c). 
13. 2e*’(x cos 2y dx — sin 2y dy) 


20. Path Dependence. Construct three simple examples 
in each of which two equations (6’) are satisfied, but 
the third is not. 


10.3 Calculus Review: Double Integrals. 


Optional 


This section is optional. Students familiar with double integrals from calculus should 
skip this review and go on to Sec. 10.4. This section is included in the book to make it 
reasonably self-contained. 

In a definite integral (1), Sec. 10.1, we integrate a function f(x) over an interval 
(a segment) of the x-axis. In a double integral we integrate a function f(x, y), called the 
integrand, over a closed bounded region” R in the xy-plane, whose boundary curve has a 
unique tangent at almost every point, but may perhaps have finitely many cusps (such as 
the vertices of a triangle or rectangle). 

The definition of the double integral is quite similar to that of the definite integral. We 
subdivide the region R by drawing parallels to the x- and y-axes (Fig. 227). We number the 
rectangles that are entirely within R from 1| to n. In each such rectangle we choose a point, 
say, (X~, Vx) in the kth rectangle, whose area we denote by AA;,. Then we form the sum 


Jn = > fr Ve) AAg. 
k=1 


Fig. 227. Subdivision of a region R 


24 region R is a domain (Sec. 9.6) plus, perhaps, some or all of its boundary points. R is closed if its boundary 
(all its boundary points) are regarded as belonging to R; and R is bounded if it can be enclosed in a circle of 
sufficiently large radius. A boundary point P of R is a point (of R or not) such that every disk with center P 
contains points of R and also points not of R. 
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This we do for larger and larger positive integers n in a completely independent manner, 
but so that the length of the maximum diagonal of the rectangles approaches zero as n 
approaches infinity. In this fashion we obtain a sequence of real numbers Jy, Jn,,° °° - 
Assuming that f(x, y) is continuous in R and R is bounded by finitely many smooth curves 
(see Sec. 10.1), one can show (see Ref. [GenRef4] in App. 1) that this sequence converges 
and its limit is independent of the choice of subdivisions and corresponding points 
(Xx, Yx). This limit is called the double integral of f(x, y) over the region R, and is 
denoted by 


| |e. y) dx dy or | |e. y) dA. 
R 


R 


Double integrals have properties quite similar to those of definite integrals. Indeed, for 
any functions f and g of (x, y), defined and continuous in a region R, 


| ara dy = ‘| |ravay (k constant) 
R R 
(1) | [ur @acay = | [ravay + | feacas 
R R R 
| [rea = | fracas [ |ravay (Fig. 228). 


Furthermore, if R is simply connected (see Sec. 10.2), then there exists at least one point 
(xo, Yo) in R such that we have 


(2) | lr (x, y) dx dy = f(Xo, Yo)A, 


R 


where A is the area of R. This is called the mean value theorem for double integrals. 


Fig. 228. Formula (1) 


Evaluation of Double Integrals 
by Two Successive Integrations 


Double integrals over a region R may be evaluated by two successive integrations. We 
may integrate first over y and then over x. Then the formula is 


b h(x) 
(3) | [re y) dx dy = | | f(x, y) “| dx (Fig. 229). 


R a gta) 
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Here y = g(x) and y = A(x) represent the boundary curve of R (see Fig. 229) and, keeping 
x constant, we integrate f(x, y) over y from g(x) to h(x). The result is a function of x, and 
we integrate it from x = a to x = b (Fig. 229). 

Similarly, for integrating first over x and then over y the formula is 


d- ray) 
(4) | [ressaeay = | | f(x, y) as| dy (Fig. 230). 


R w ply) 


oO 


I 
I 
| 
I 
| 
a 6 x x 


Fig. 229. Evaluation of a double integral Fig. 230. Evaluation of a double integral 


The boundary curve of R is now represented by x = p(y) and x = q(y). Treating y as a 
constant, we first integrate f(x, y) over x from p(y) to qg(y) (see Fig. 230) and then the 
resulting function of y from y = c to y = d. 

In (3) we assumed that R can be given by inequalities a = x S band g(x) Sy SA). 
Similarly in (4) by c S y S dand p(y) = x S q(y). If aregion R has no such representation, 
then, in any practical case, it will at least be possible to subdivide R into finitely many 
portions each of which can be given by those inequalities. Then we integrate f(x, y) over 
each portion and take the sum of the results. This will give the value of the integral of 
S(%, y) over the entire region R. 


Applications of Double Integrals 


Double integrals have various physical and geometric applications. For instance, the area 
A of a region R in the xy-plane is given by the double integral 


A= | [ax dy. 
R 
The volume V beneath the surface z = f(x, y) (> 0) and above a region R in the xy-plane 
is (Fig. 231) 
= | [ren dx dy 
R 


because the term f(x, yx,)AAx in Jy, at the beginning of this section represents the volume 
of a rectangular box with base of area AA;, and altitude f(xx, yx). 
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F(x, y) 


Fig. 231. Double integral as volume 


As another application, let f(x, y) be the density (= mass per unit area) of a distribution 
of mass in the xy-plane. Then the total mass M in R is 


M= | [res aay 


R 


the center of gravity of the mass in R has the coordinates x, y, where 


_ 1 _ 1 
x= wf), pes y) dx dy and y= uf), pres y) dx dy; 


the moments of inertia J, and J, of the mass in R about the x- and y-axes, respectively, are 


Ip = | | y*f(x, y)dxdy, ly = | | x(x, y) dx dy; 


R R 


and the polar moment of inertia /) about the origin of the mass in R is 


Ig = Ip + Ty = | fo? + y) f(x, y) dx dy. 
R 


An example is given below. 


Change of Variables in Double Integrals. Jacobian 


Practical problems often require a change of the variables of integration in double integrals. 
Recall from calculus that for a definite integral the formula for the change from x to u is 


b 


B 
(5) | Fx) dx = | flow) & du 
u 


a a 


Here we assume that x = x(u) is continuous and has a continuous derivative in some 
interval a S u S B such that x(a) = a, x(B) = b[or x(a) = b, x(B) = a] and x(u) varies 
between a and b when u varies between a and B. 

The formula for a change of variables in double integrals from x, y to u, U is 


a(x, y) 
O(u, V) 


du dv; 


(6) | |e. y) dx dy = | [row v), y(u, V)) 


i 
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that is, the integrand is expressed in terms of u and v, and dx dy is replaced by du du times 
the absolute value of the Jacobian* 


Bs TEE 
_ Oy) du dV) _ ax ay ax dy 
~ a(u,v) lay dy! audv dv au’ 
au av 


(7) 


Here we assume the following. The functions 
xX = x(u, V), y = yu, v) 


effecting the change are continuous and have continuous partial derivatives in some region 
R* in the wv-plane such that for every (u, v) in R* the corresponding point (x, y) lies in 
R and, conversely, to every (x, y) in R there corresponds one and only one (u, v) in R*; 
furthermore, the Jacobian J is either positive throughout R* or negative throughout R*. 
For a proof, see Ref. [GenRef4] in App. 1. 


Change of Variables in a Double Integral 


Evaluate the following double integral over the square R in Fig. 232. 
[Je + y?) dx dy 
R 


Solution. The shape of R suggests the transformation x + y = u,x —y =v. Then x = 3(u + v), 
y = 3(u — v). The Jacobian is 


any) |e 3 1 
av) JE —2 2 
R corresponds to the square 0 S u = 2,0 Sv 3S 2. Therefore, 
ary 1 8 
| fo? + y?) dx dy | | - (u2 + v?) A du dv : tea] 


R 0-0 = 


Fig. 232. Region R in Example 1 


3Named after the German mathematician CARL GUSTAV JACOB JACOBI (1804-1851), known for his 
contributions to elliptic functions, partial differential equations, and mechanics. 
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Of particular practical interest are polar coordinates r and 6, which can be introduced 
by setting x = rcos@, y=rsin@. Then 


a(x, y) cos@ -—rsin@ 


II 
= 


ar,9)  |sin@ rcos 0 


and 


(8) | |r (x, y) dx dy = | 
R 


[re cos 6, r sin 0) r dr d@ 
R* 


where R* is the region in the 7@-plane corresponding to R in the xy-plane. 


EXAMPLE 2 _ Double Integrals in Polar Coordinates. Center of Gravity. Moments of Inertia 


Let f(x, y) = 1 be the mass density in the region in Fig. 233. Find the total mass, the center of gravity, and the 


= moments of inertia /,,, Ly, Io. 
Solution. We use the polar coordinates just defined and formula (8). This gives the total mass 
m/2 rl TA) aT 
1 «x M= dx dy = r dr d@ = oe ae 
R 0 0 0 
Fig. 233. 
Example 2 


The center of gravity has the coordinates 


4 m/2 71 4 m/24 4 
x= <| | rcos 6 rdr dé | cos 6 d6 0.4244 
WS dp Ti, 3 30 


a 


for reasons of symmetry. 
30 


y = 
The moments of inertia are 


m/2 1 1/2 1 
Ly, = | [>?acay = { | r? sin? Or dr dd = | — sin? 6 d0 
R 0 eR 


0 
m2 y 1/7 T 

= —(1 — cos 20) d6 0 0.1963 
» 8 8\2 16 


7 7 
lI, = 16 for reasons of symmetry, Ip = 1, + ly = = = 0.3927. 
Why are x and y less than 3? i] 


This is the end of our review on double integrals. These integrals will be needed in this 
chapter, beginning in the next section. 
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PROBLEM SET 10-3 


1. Mean value theorem. Illustrate (2) with an example. 14. y 


2-8| DOUBLE INTEGRALS 


Describe the region of integration and evaluate. 


2 -2u 
2. | | (x + y)* dy dx 


0 “x 


3 ry 
is | | (x2 + y) dx dy "1 a 
Bes 15. i 


4. Prob. 3, order reversed. 


v7) 


a) 


1px 
. | fa — 2xy) dy dx 


0 ~2 


a 


2 ry 
[ [sim »aeas 
0-0 


7. Prob. 6, order reversed. 16. 


7/4 pcosy 
8. | | x sin y dx dy 


0 0 


9-11} VOLUME 


Find the volume of the given region in space. 


9. The region beneath z = 4x7 + 9y? and above the m 


rectangle with vertices (0, 0), (3, 0), (3, 2), (0, 2) in the 

an MOMENTS OF INERTIA 
10. The first octant region bounded by the coordinate planes 

and the surfaces y = 1 — x? z= 1 — x”. Sketch it. 
11. The region above the xy-plane and below the parabo- 

loid z= 1 — (x2 + y?). 


Find [,, ly, 19 of a mass of density f(x, y) = 1 in the region 
R in the figures, which the engineer is likely to need, along 
with other profiles listed in engineering handbooks. 


17. R as in Prob. 13. 
18. R as in Prob. 12. 


12-16; CENTER OF GRAVITY 19. y 


Find the center of gravity (x,y) of a mass of density A 
J(x, y) = 1 in the given region R. 
a 
12. 5 
y ; a 
h 2 
| 
20. 
1 b Xx 
36 
h 
13. y 
h 
| | 
| 
a Fi b a x 
on ee eG 
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10.4 Green’s Theorem in the Plane 


THEOREM -1 


Double integrals over a plane region may be transformed into line integrals over the 
boundary of the region and conversely. This is of practical interest because it may simplify 
the evaluation of an integral. It also helps in theoretical work when we want to switch from 
one kind of integral to the other. The transformation can be done by the following theorem. 


Green’s Theorem in the Plane* 
(Transformation between Double Integrals and Line Integrals) 


Let R be a closed bounded region (see Sec. 10.3) in the xy-plane whose boundary 
C consists of finitely many smooth curves (see Sec. 10.1). Let Fy(x, y) and Fo(x, y) 
be functions that are continuous and have continuous partial derivatives dF \/dy 
and 0F2/dx everywhere in some domain containing R. Then 


OFg OFy 
(1) [| (G@-P)aw- p(Fide + Pedy) 
5 Ox oy C 


Here we integrate along the entire boundary C of R in such a sense that R is on 
the left as we advance in the direction of integration (see Fig. 234). 


x 


Fig. 234. Region R whose boundary C consists of two parts: 
C, is traversed counterclockwise, while C, is traversed clockwise 
in such a way that R is on the left for both curves 


Setting F = [Fy, Fo] = Fyi + Foj and using (1) in Sec. 9.9, we obtain (1) in vectorial 
form, 


(1’) | | cont + kar ay = pF. 


R Cc 


The proof follows after the first example. For ¢ see Sec. 10.1. 


4GEORGE GREEN (1793-1841), English mathematician who was self-educated, started out as a baker, and 
at his death was fellow of Caius College, Cambridge. His work concerned potential theory in connection with 
electricity and magnetism, vibrations, waves, and elasticity theory. It remained almost unknown, even in England, 
until after his death. 

A “domain containing R” in the theorem guarantees that the assumptions about F' and Fy, at boundary points 
of R are the same as at other points of R. 
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EXAMPLE 1 __ Verification of Green’s Theorem in the Plane 


Green’s theorem in the plane will be quite important in our further work. Before proving it, let us get used to 
it by verifying it for Fy y? Ty, Fg = 2xy + 2x and C the circle xe + y? =1. 


Solution. 1n (1) on the left we get 


OF 5 OF y 
[IC ) aay [ fie + 2) — (2y lax dy = 9| fac dy = 9m 
7 Ox oy ie 


R 


since the circular disk R has area 77. 
We now show that the line integral in (1) on the right gives the same value, 977. We must orient C 
counterclockwise, say, r(t) = [cos ¢, sin t]. Then r’(t) = [—sin ¢, cos f], and on C, 


F, = y? — Ty = sin? t — 7sint, Fg = 2xy + 2x = 2 costsint + 2 cost. 


Hence the line integral in (1) becomes, verifying Green’s theorem, 


20 
por + Foy’) dt = | [(sin? ¢ — 7 sin t)(—sin t) + 2(cos t sin t + cos f)(cos f)] dt 
c 0 


2a 
| (-sin® t + 7 sin? t + 2 cos? sin t + 2 cos? t) dt 
0 


0+ 77 —0+4 27 = 977. B 


PROOF _ Weprove Green’s theorem in the plane, first for a special region R that can be represented 
in both forms 


aZ=x2b, u(x) Sy Sv) (Fig. 235) 
and 
cSysd, py) Sx=q0) (Fig. 236) 
y y 
Cr* 
| 
cy 
u(x) 
° - - ° 
a b x x 
Fig. 235. Example of a special region Fig. 236. Example of a special region 


Using (3) in the last section, we obtain for the second term on the left side of (1) taken 
without the minus sign 


aFy : 
(2) | [Rreo=|| 
dy 


v(@) 
| OF, 
R a Ua) 


= “| dx (see Fig. 235). 
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(The first term will be considered later.) We integrate the inner integral: 


y=v(x) 
= Fy[x, vx)] — Fi [x, u@)]. 


v(x) an 
—_ dy = FiG, y) 
y=uca) 


we) OY 


By inserting this into (2) we find (changing a direction of integration) 


b 


b 
OF y 
| [Stacay = | Fy [x, v(x)] dx — | Fy [x, u(x)] dx 
R 


a a 


b 


-| Fy[x, v(x)] dx — | Fy[x, u(x)] dx. 


b a 


Since y = u(x) represents the curve C** (Fig. 235) and y = u(x) represents C*, the last 
two integrals may be written as line integrals over C** and C* (oriented as in Fig. 235); 


therefore, 
OF y 
(3) | [Sacay = -| Fy(x, y) dx — | Fy(x, y) dx 
R y (orn Cx 
=- price y) dx. 
Cc 


This proves (1) in Green’s theorem if Fy = 0. 

The result remains valid if C has portions parallel to the y-axis (such as C and C in 
Fig. 237). Indeed, the integrals over these portions are zero because in (3) on the right we 
integrate with respect to x. Hence we may add these integrals to the integrals over C* and 
C** to obtain the integral over the whole boundary C in (3). 

We now treat the first term in (1) on the left in the same way. Instead of (3) in the last 
section we use (4), and the second representation of the special region (see Fig. 236). 
Then (again changing a direction of integration) 


aFy d- (Wor, 
| [G2 aa= |] as|ay 
Ox Ox 
R c ply) 
d c 
= | Fo(qQ), y) dy + | F2(p(y), y) dy 


c d 


= p Face y) dy. 
cC 


Together with (3) this gives (1) and proves Green’s theorem for special regions. 

We now prove the theorem for a region R that itself is not a special region but can be 
subdivided into finitely many special regions as shown in Fig. 238. In this case we apply 
the theorem to each subregion and then add the results; the left-hand members add up to 
the integral over R while the right-hand members add up to the line integral over C plus 
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EXAMPLE 2 


EXAMPLE 3 
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AN 


a 


C* 


fe. ° 
x 


Fig. 237. Proof of Green’s theorem Fig. 238. Proof of Green’s theorem 


integrals over the curves introduced for subdividing R. The simple key observation now 
is that each of the latter integrals occurs twice, taken once in each direction. Hence they 


cancel each other, leaving us with the line integral over C. 


The proof thus far covers all regions that are of interest in practical problems. To prove 
the theorem for a most general region R satisfying the conditions in the theorem, we must 


x 


approximate R by a region of the type just considered and then use a limiting process. 


For details of this see Ref. [GenRef4] in App. 1. 


Some Applications of Green’s Theorem 


Area of a Plane Region as a Line Integral Over the Boundary 


In (1) we first choose F, = 0, Fg = x and then F, = —y, Fy = 0. This gives 


| J avay= pray and | J avay = ~$ yds 
Cc c 


R R 


respectively. The double integral is the area A of R. By addition we have 


(4) A= 


N[e- 


} (dy — yay 
@ 


where we integrate as indicated in Green’s theorem. This interesting formula expresses the area of R in terms 
of a line integral over the boundary. It is used, for instance, in the theory of certain planimeters (mechanical 


instruments for measuring area). See also Prob. 11. 


For an ellipse x?/ a + y?/b? = 1 or x =acost,y = bsint we get x’ = —asint, y’ = bcos t; thus from 


(4) we obtain the familiar formula for the area of the region bounded by an ellipse, 


2a Pus 
1 
A | (xy’ — yx’) dt | [ab cos? t — (—ab sin? t)] dt = tab. 
2 Jy 21 


Area of a Plane Region in Polar Coordinates 


Let r and @ be polar coordinates defined by x = rcos 6, y = rsin 6. Then 


dx = cos 6 dr — rsin @ d6, dy = sin@ dr + rcos 6 dé, 
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and (4) becomes a formula that is well known from calculus, namely, 
(5) A=- ; r? dO. 
@ 


As an application of (5), we consider the cardioid r = a(1 — cos 0), where 0 S 6 S 277 (Fig. 239). We find 


Qar 3¢r 
| (1 ~ cos 6)? d9 = “a”. =| 
0 


Transformation of a Double Integral of the Laplacian of a Function 
into a Line Integral of Its Normal Derivative 


The Laplacian plays an important role in physics and engineering. A first impression of this was obtained in 
Sec. 9.7, and we shall discuss this further in Chap. 12. At present, let us use Green’s theorem for deriving a 
basic integral formula involving the Laplacian. 

We take a function w(x, y) that is continuous and has continuous first and second partial derivatives in a 
domain of the xy-plane containing a region R of the type indicated in Green’s theorem. We set Fy = —dw/dy 
and Fy = dw/dx. Then dFy/dy and dF5/dx are continuous in R, and in (1) on the left we obtain 


dFn OF, dw dw . 
(6) x+y = Vw, 
Ox oy Ox oy 


the Laplacian of w (see Sec. 9.7). Furthermore, using those expressions for Fy and Fy, we get in (1) on the right 


dx awde a 
(7) a dx + Fpdy) ; G + Fp 2 as ; ( ee ) ae 
c c ds ds c\ dy ds dx ds 


where s is the arc length of C, and C is oriented as shown in Fig. 240. The integrand of the last integral may 
be written as the dot product 


(8) (sadeibean i ow . E dx dw dy dw dx 


ax ay ds’ ds ax ds ay ds” 


The vector n is a unit normal vector to C, because the vector r’(s) = dr/ds = [dx/ds, dy/ds] is the unit tangent 
vector of C, and r’ «n = 0, so that n is perpendicular to r’. Also, n is directed to the exterior of C because in 
Fig. 240 the positive x-component dx/ds of r’ is the negative y-component of n, and similarly at other points. From 
this and (4) in Sec. 9.7 we see that the left side of (8) is the derivative of w in the direction of the outward normal 
of C. This derivative is called the normal derivative of w and is denoted by dw/dn; that is, dw/dn = (grad w) ¢ n. 
Because of (6), (7), and (8), Green’s theorem gives the desired formula relating the Laplacian to the normal derivative, 


ow 
(9) | [vwacay = ; on oS: 
R (ep te 
For instance, w = x” — y? satisfies Laplace’s equation V?w = 0. Hence its normal derivative integrated over a closed 
curve must give 0. Can you verify this directly by integration, say, for the square0 Sx =1,0Sy 1? a 
y 


Y’ 


Fig. 239. Cardioid Fig. 240. Example 4 
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Green’s theorem in the plane can be used in both directions, and thus may aid in the 
evaluation of a given integral by transforming the given integral into another integral that 
is easier to solve. This is illustrated further in the problem set. Moreover, and perhaps 
more fundamentally, Green’s theorem will be the essential tool in the proof of a very 
important integral theorem, namely, Stokes’s theorem in Sec. 10.9. 


PROBLEM SET 10-4 


1-10 


LINE INTEGRALS: EVALUATION 


BY GREEN’S THEOREM 


Evaluate | F(r) ¢ dr counterclockwise around the boundary 


C 
1 
2. 


ot nAwm sp 


11. 


12. 


Cc 
of the region R by Green’s theorem, where 


. F = [y, —x], C the circle x? + y? = 1/4 


_F= [6y", 2¢:= 2y*I, R the square with vertices 
+(2, 2), =(2, —2) 


. F = [xeY, y2e"], R the rectangle with vertices (0, 0), 
(2, 0), (2, 3), (0, 3) 


. F = [x cosh 2y, 2x? sinh 2y], R: = yx 
. F = [x2 + y?, x? — y?], Rilsys2-x? 
F = [cosh y, —sinh x], 
. F = grad (x? cos” (xy)), Ras in Prob. 5 

R the semidisk 


R1lSxS3,x Sy = 3x 


F = [-e “cosy, —e “sin y], 
x2 + y? <= 16, x20 


. F = fe" Yinx + 2x], Ril +x+Sys2 

. F = [x*y?, -x/y"], Ril Sx? +y? 54,720, 

y 2x. Sketch R. 

CAS EXPERIMENT. Apply (4) to figures of your 


choice whose area can also be obtained by another 
method and compare the results. 


PROJECT. Other Forms of Green’s Theorem in 
the Plane. Let R and C be as in Green’s theorem, r’ 
a unit tangent vector, and n the outer unit normal vector 
of C (Fig. 240 in Example 4). Show that (1) may be 
written 


(10) | [aivraray = $remas 


R Cc 


or 


(11) [ freon F)¢kdx dy = pRer'ds 
R (8: 


where k is a unit vector perpendicular to the xy-plane. 
Verify (10) and (11) for F = [7x, —3y] and C the circle 
x2 + y? = 4 as well as for an example of your own 
choice. 


13-17 


INTEGRAL 


Using (9), find the value of | 


OF THE NORMAL DERIVATIVE 


OW ds taken counterclockwise 
Cc on 


over the boundary C of the region R. 


13. 


14. 
15. 


16. 


17. 


18. 


19. 


20. 


w = coshx, R the triangle with vertices (0, 0), (4, 2), 


(0, 2). 


2 


w = x2y + xy”, Rix? +y?S1,x20,y20 


w=e™cosy + xy, Ril SyS10—-x7,x=0 


W=x2+ y?, Cx? + y? = 4. Confirm the answer 
by direct integration. 


|x| =2 


Laplace’s equation. Show that for a solution w(x, y) 
of Laplace’s equation V?w = 0 in a region R with 
boundary curve C and outer unit normal vector n, 


Show that w = e”sin y satisfies Laplace’s equation 
V-w = 0 and, using (12), integrate w(dw/dn) counter- 
clockwise around the boundary curve C of the rectangle 
03%52,05yS85. 


Same task as in Prob. 19 when w = x” + y? and C 
the boundary curve of the triangle with vertices (0, 0), 
(1, 0), (0, 1). 
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10.5 Surfaces for Surface Integrals 


Whereas, with line integrals, we integrate over curves in space (Secs. 10.1, 10.2), with 
surface integrals we integrate over surfaces in space. Each curve in space is represented 
by a parametric equation (Secs. 9.5, 10.1). This suggests that we should also find 
parametric representations for the surfaces in space. This is indeed one of the goals of 
this section. The surfaces considered are cylinders, spheres, cones, and others. The 
second goal is to learn about surface normals. Both goals prepare us for Sec. 10.6 on 
surface integrals. Note that for simplicity, we shall say “surface” also for a portion of 
a surface. 


Representation of Surfaces 


Representations of a surface S in xyz-space are 


(1) z=f@y) or g@y,z2=9. 


For example, z = + Va? — x? — ve or x? + y? + 27 — a? =0(¢=0) represents a 
hemisphere of radius a and center 0. 

Now for curves C in line integrals, it was more practical and gave greater flexibility to 
use a parametric representation r = r(t), where a St Sb. This is a mapping of the 
interval a = t S bd, located on the f-axis, onto the curve C (actually a portion of it) in 
xyz-space. It maps every ¢ in that interval onto the point of C with position vector r(?). 
See Fig. 241A. 

Similarly, for surfaces S in surface integrals, it will often be more practical to use a 
parametric representation. Surfaces are two-dimensional. Hence we need two parameters, 


Curve C 
in space 
Surface S 
r(u,v) in space 
% y 
t 
— -_-———————1 — 
a b 
(t-axis) 
(uv-plane) 
(A) Curve (B) Surface 


Fig. 241. Parametric representations of a curve and a surface 
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which we call u and v. Thus a parametric representation of a surface S in space is of 
the form 


(2) r(u, V) = [x(u, v), y(u, Vv), Zu, V)] = x(u, v)i + yu, v)j + z(u, v)k 


where (u, VU) varies in some region R of the uwv-plane. This mapping (2) maps every point 
(u, V) in R onto the point of S with position vector r(u, v). See Fig. 241B. 


Parametric Representation of a Cylinder 


The circular cylinder x* + y? = a®, -1 Sz S 1, has radius a, height 2, and the z-axis as axis. A parametric 


representation is 


r(u, VU) = [acos u, a sin u, v] = acos ui + asinuj + vk (Fig. 242). 


The components of r are x = acos u, y = asin u, z = v. The parameters u, v vary in the rectangle R:0 Su S 
27, —1 Sv S lin the wv-plane. The curves u = const are vertical straight lines. The curves v = const are 
parallel circles. The point P in Fig. 242 corresponds to u = 7/3 = 60°, v = 0.7. ie) 


Fig. 242. Parametric representation Fig. 243. Parametric representation 
of a cylinder of a sphere 


Parametric Representation of a Sphere 


A sphere x? + y? + 2? = a? can be represented in the form 
(3) r(u, V) = acosv cos ui + acosv sinuj + asinvk 


where the parameters u, v vary in the rectangle R in the uv-plane given by the inequalities 0 = u S 27, 
—1/2 Sv S 77/2. The components of r are 


xX = acosv cos u, y = acosv sinu, Z=asinv. 
The curves uw = const and v = const are the “meridians” and “parallels” on S (see Fig. 243). This representation 


is used in geography for measuring the latitude and longitude of points on the globe. 
Another parametric representation of the sphere also used in mathematics is 


(3*) r(u, V) = acosusinvi + asinusinvj + acosvk 


whereO Su Z27,0 Sv 7. |_| 


SEC. 10.5 Surfaces for Surface Integrals 441 


EXAMPLE 3 


Parametric Representation of a Cone 


A circular cone z = Vx? + y?, 0 St S H can be represented by 


1 1 


r(u, Vv) = [wcos v, usin v, u] = ucos vit usinuj + uk, 


in components x = u cos v, y = usin v, z = u. The parameters vary in the rectangle R:0 Su SH,0 Sv S27. 
Check that x? + y? = 2, as it should be. What are the curves u = const and v = const? B 


Tangent Plane and Surface Normal 


Recall from Sec. 9.7 that the tangent vectors of all the curves on a surface S through a point 
P of S form a plane, called the tangent plane of S at P (Fig. 244). Exceptions are points where 
S has an edge or a cusp (like a cone), so that S cannot have a tangent plane at such a point. 
Furthermore, a vector perpendicular to the tangent plane is called a normal vector of S at P. 

Now since S can be given by r = r(u, v) in (2), the new idea is that we get a curve C 
on S by taking a pair of differentiable functions 


u = u(t), v = v(t) 
whose derivatives u’ = du/dt and v' = dv/dt are continuous. Then C has the position 


vector F(t) = r(u(f), U(t)). By differentiation and the use of the chain rule (Sec. 9.6) we 
obtain a tangent vector of C on S 


Hence the partial derivatives r,, and r, at P are tangential to S at P. We assume that they 
are linearly independent, which geometrically means that the curves u = const and 
v = const on S intersect at P at a nonzero angle. Then r,, and r, span the tangent plane 
of S at P. Hence their cross product gives a normal vector N of S at P. 


(4) IN| = Tey, os Ty = (0 


The corresponding unit normal vector n of S at P is (Fig. 244) 


(5) n= N = L Typ XR. 


S 


Fig. 244. Tangent plane and normal vector 
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Also, if S is represented by g(x, y, z) = 0, then, by Theorem 2 in Sec. 9.7, 


(5*) n= er g. 
| grad g| 
A surface S is called a smooth surface if its surface normal depends continuously on 
the points of S. 
S is called piecewise smooth if it consists of finitely many smooth portions. 
For instance, a sphere is smooth, and the surface of a cube is piecewise smooth 
(explain!). We can now summarize our discussion as follows. 


Tangent Plane and Surface Normal 


If a surface S is given by (2) with continuous r,, = dr/du andr, = dr/dv satisfying 
(4) at every point of S, then S has, at every point P, a unique tangent plane passing 
through P and spanned by r,, and r,, and a unique normal whose direction depends 
continuously on the points of S. A normal vector is given by (4) and the corresponding 
unit normal vector by (5). (See Fig. 244.) 


Unit Normal Vector of a Sphere 


From (5*) we find that the sphere g(x, y, z) x24 y? + 22 — q? = 0 has the unit normal vector 
xy Z Bier Sg ue S 
Wx, Y,2) = 15. 5-G at gl a 


We see that n has the direction of the position vector [x, y, z] of the corresponding point. Is it obvious that this 
must be the case? | 


Unit Normal Vector of a Cone 


At the apex of the cone g(x, y, z) zt Vx? 4 y? 0 in Example 3, the unit normal vector n becomes 


undetermined because from (5*) we get 


x y =1 1 ( x y ) 
n ; : i+ j—k). i] 
Ror a y?) V2(x? ae y?) = VI\Vx? + y? Vx + y? 


We are now ready to discuss surface integrals and their applications, beginning in the next 
section. 


PROBLEM SET 10-5 


PARAMETRIC SURFACE REPRESENTATION — 3. Cone r(u,v) = [wcosv, usin, cu] 
Familiarize yourself with parametric representations of 4. Elliptic cylinder rw, v) = [acosv, bsinv, ul 
important surfaces by deriving a representation (1), by 5. Paraboloid of revolution r(u,v) = [ucosv, usinov, 
finding the parameter curves (curves u = const and Ti 
v = const) of the surface and a normal vector N = ry X ry 6. Helicoid r(u, v) = [ucosv, usinv, v]. Explain the 
of the surface. Show the details of your work. name. 

1. xy-plane r(u, v) = (u,v) (thus wi + vj; similarly in 7. Ellipsoid r(u, v) = [acosu cosu, bcosvu sinu, 

Probs. 2-8). csinv] 
2. xy-plane in polar coordinates r(u, v) = [ucosv, usinv] 8. Hyperbolic paraboloid r(u, v) = [au cosh v, 


(thus u = r,v = @) bu sinh v, u"| 
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9. CAS EXPERIMENT. Graphing Surfaces, Depen- 14-19} DERIVE A PARAMETRIC 
dence on a, b, c. Graph the surfaces in Probs. 3-8. In REPRESENTATION 


Prob. 6 generalize the surface by introducing parame- 
ters a, b. Then find out in Probs. 4 and 6-8 how the 
shape of the surfaces depends on a, b, c. 


Find a normal vector. The answer gives one representation; 


there are many. Sketch the surface and parameter curves. 


14. Plane 4x + 3y + 2z = 12 
10. Orthogonal parameter curves wu = const and 15. Cylinder of revolution (x — 2)? + (y + 1)? = 25 
v = const on r(u, v) occur if and only if r,*r, = 0. 16. Ellipsoid x2 + y? + 327 =1 
Give examples. Prove it. 17. Sphere x? + (y + 2.8)? + (z — 3.2)? = 2.25 
. . — 2 4 
11. Satisfying (4). Represent the paraboloid in Prob. 5 so 18, Elliptic a ‘ a Var + ay 
that N(0, 0) # 0 and show N. 19. Hyperbolic cylinder x" — y" = 1 


12. 


13. 


Condition (4). Find the points in Probs. 1-8 at which 
(4) N # 0 does not hold. Indicate whether this results 
from the shape of the surface or from the choice of the 
representation. 


Representation z = f(x, y). Show that z = f(x, y) or 
g =z-—f(x, y) = 0can be written (f,, = df/du, etc.) 


r(u,v) = [u, v, and 


fu, v)] 
[ tur to: 1). 


(6) 


N = grad g 


20 


. PROJECT. Tangent Planes 7(P) will be less 
important in our work, but you should know how to 
represent them. 

(a) If S: r(u, v), then T(P): (r* — r Ty 
(a scalar triple product) or 

r*(p,q) = r(P) + pry(P) + gro(P). 
(b) If S: g(x, y, z) = 0, then 

T(P): (r* — r(P)) + Vg = 0. 
(c) If S:z = f(x, y), then 
T(P): 2* — 2 = (x* — Of P) + O* — yf P). 

Interpret (a)—(c) geometrically. Give two examples for 
(a), two for (b), and two for (c). 


r,) =0 


10.6 Surface Integrals 


To define a surface integral, we take a surface S, given by a parametric representation as 
just discussed, 


() r(u,v) = [x(u, v), yu, Vv), Zu, V)] = xu, Vv) + y(u, vj + zu, VK 


where (u, UV) varies over a region R in the uv-plane. We assume S to be piecewise smooth 
(Sec. 10.5), so that $ has a normal vector 


1 


and unit normal vector n= INI 
N 


(2) N=r, X Vy 
at every point (except perhaps for some edges or cusps, as for a cube or cone). For a given 
vector function F we can now define the surface integral over S by 


(3) | [remas = | [rew v)) * N(u, v) du dv. 


Ss R 


Here N = |N|n by (2), and IN| = Ir, x r, | is the area of the parallelogram with sides 
r,, and r,, by the definition of cross product. Hence 


(3*) ndA = n|N| du dv = Ndudav. 


And we see that dA = |N|du dv is the element of area of S. 
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Also F ¢ nis the normal component of F. This integral arises naturally in flow problems, 
where it gives the flux across S when F = pv. Recall, from Sec. 9.8, that the flux across 
S is the mass of fluid crossing S per unit time. Furthermore, p is the density of the fluid 
and v the velocity vector of the flow, as illustrated by Example 1 below. We may thus 
call the surface integral (3) the flux integral. 

We can write (3) in components, using F = [Fy, Fo, F3],N=[N1, No, Nal, 
and n = [cos a, cos B, cos y]. Here, a, B, y are the angles between n and the coordinate 
axes; indeed, for the angle between n and i, formula (4) in Sec. 9.2 gives cosa = 
ne i/|n| li] = ni, and so on. We thus obtain from (3) 


| [rena = | [reese + Fycos B + F3cos y) dA 
Ss Ss 
(4) 
= | foam + FoNo + F3N3) du dv. 


In (4) we can write cos a dA = dy dz, cos B dA = dz dx, cos y dA = dx dy. Then (4) 
becomes the following integral for the flux: 


(5) | [rena = | [rad + redede + Fads) 
Ss Ss 


We can use this formula to evaluate surface integrals by converting them to double integrals 
over regions in the coordinate planes of the xyz-coordinate system. But we must carefully 
take into account the orientation of S (the choice of n). We explain this for the integrals 
of the F3-terms, 


(5') | | Fscos yaa = | | Fsavay 


Ss Ss 


If the surface S is given by z = h(x, y) with (x, y) varying in a region R in the xy-plane, 
and if S is oriented so that cos y > 0, then (5') gives 


(5") | [rs cos y dA = +| [Fc y, h(x, y)) dx dy. 
Ss R 


But if cos y < 0, the integral on the right of (5”) gets a minus sign in front. This follows 
if we note that the element of area dx dy in the xy-plane is the projection |cos y| dA 
of the element of area dA of S; and we have cos y = +|cos y| when cos y > 0, but 
cos y = —|cos y| when cos y < 0. Similarly for the other two terms in (5). At the same 
time, this justifies the notations in (5). 

Other forms of surface integrals will be discussed later in this section. 


Flux Through a Surface 


Compute the flux of water through the parabolic cylinder S: y = x7,0 Sx S$ 2,0 Sz S3 (Fig. 245) if the 
velocity vector is v = F = [322, 6, 6xz], speed being measured in meters/sec. (Generally, F = pv, but water 
has the density p = 1 g/cm? = | ton/m*.) 
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Fig. 245. Surface S in Example 1 


2 


Solution. Writing x = u and z = v, we have y = x” = u”. Hence a representation of S is 


SS: r= [u, u,v] (0<u<2,0S0 S3). 
By differentiation and by the definition of the cross product, 
N=r,Xr,=[1, 2u, 0] x (0, 0, 1] =[2u, —-1, O). 


On S, writing simply F(S) for F[r(u, v)], we have F(S) = [3v2, 6, 6uv]. Hence F(S)* N = 6uv2 — 6. By 
integration we thus get from (3) the flux 


2 
dv 


u=0 


32 3 
| [rena = | [ cou? ~ 6) auav = | Guo? = 61 
Ss 


0-0 0 


3 


3 
= | (12u? — 12) dv = (4v? — 12v) 
0 


= 108 — 36 = 72 [m3/sec] 
v=0 


or 72,000 liters/sec. Note that the y-component of F is positive (equal to 6), so that in Fig. 245 the flow goes 
from left to right. 
Let us confirm this result by (5). Since 


N = |N|n = |N|[cos a, cos B, cos y] = [2u, 1, O] = [2x, 1, O] 


we see that cos a > 0, cos B < 0, and cos y = 0. Hence the second term of (5) on the right gets a minus sign, 
and the last term is absent. This gives, in agreement with the previous result, 


34 2 73 3 2 
[Jrenas=| | s%aa-[ [oacar= | 4ae%ae- [6-30 4-33-6-3-2=72. 
s a) 0 “0 0 0 


Surface Integral 


Evaluate (3) when F = [x2, 0, 3y?] and S is the portion of the plane x + y + z = 1 in the first octant (Fig. 246). 


Solution. Writing x = u and y = v, we have z = 1 — x — y = 1 — u — v. Hence we can represent the 
plane x + y + z = Ll inthe form r(u, v) = [u, v, 1 — u — v]. We obtain the first-octant portion S of this plane 
by restricting x = u and y = v to the projection R of S in the xy-plane. R is the triangle bounded by the two 
coordinate axes and the straight line x + y = 1, obtained from x + y + z= 1 by setting z= 0. Thus 
0:S2%5:1 —7,0S 75-1, 


1 
a ¥ 


Fig. 246. Portion of a plane in Example 2 
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By inspection or by differentiation, 
N=r, Xr =[1,0, -1] X [0, 1, -1] = [1, 1, 1). 


Hence F(S) * N = [u2, 0, 3v2] « [1, 1, 1] = u2 + 3u”. By (3), 


1 -l1-v 
[[remaa= | fo? + 30% dude = | | (u2 + 3v2) du dv 
Ss R 0-0 


1 1 
1 3 + 32 (1 di . BH 
E v) v"( v) | dv 3 


Orientation of Surfaces 


From (3) or (4) we see that the value of the integral depends on the choice of the unit 
normal vector n. (Instead of n we could choose —n.) We express this by saying that such 
an integral is an integral over an oriented surface S, that is, over a surface § on which 
we have chosen one of the two possible unit normal vectors in a continuous fashion. (For 
a piecewise smooth surface, this needs some further discussion, which we give below.) 
If we change the orientation of S, this means that we replace n with —n. Then each 
component of n in (4) is multiplied by —1, so that we have 


Change of Orientation in a Surface Integral 


The replacement of n by —n (hence of N by —N) corresponds to the multiplication 
of the integral in (3) or (4) by —1. 


In practice, how do we make such a change of N happen, if S is given in the form (1)? 
The easiest way is to interchange u and v, because then r,, becomes r, and conversely, 
so that N =r, X r, becomes r, X ry, = —ry, X ry = —N, as wanted. Let us illustrate 
this. 


Change of Orientation in a Surface Integral 


In Example 1 we now represent S by F = [v, v2, u],0 Sv S 2,0 =u S 3. Then 


N=f, x F, = [0,0, 1] x [1, 2u, 0] = [—2u, 1, 0]. 


For F = [3z7, 6, 6xz] we now get F(S) = [Bu?, 6, 6uv]. Hence F(S) N= —6u2v + 6 and integration gives the 
old result times —1, 


3/2 3 
| [F+W du { [i 6u7v + 6) du du [i 12u? + 12) du 72: B 
R 0 “0 0 


Orientation of Smooth Surfaces 


A smooth surface S$ (see Sec. 10.5) is called orientable if the positive normal direction, 
when given at an arbitrary point Pp of S, can be continued in a unique and continuous 
way to the entire surface. In many practical applications, the surfaces are smooth and thus 
orientable. 
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C C | 
n 


(a) Smooth surface 


n 


(b) Piecewise smooth surface 


Fig. 247. Orientation of a surface 


Orientation of Piecewise Smooth Surfaces 


Here the following idea will do it. For a smooth orientable surface S with boundary curve 
C we may associate with each of the two possible orientations of S an orientation of C, 
as shown in Fig. 247a. Then a piecewise smooth surface is called orientable if we can 
orient each smooth piece of S so that along each curve C* which is a common boundary 
of two pieces S$; and Sg the positive direction of C* relative to S$; is opposite to the 
direction of C* relative to Sg. See Fig. 247b for two adjacent pieces; note the arrows 
along C*. 


Theory: Nonorientable Surfaces 


A sufficiently small piece of a smooth surface is always orientable. This may not hold for 
entire surfaces. A well-known example is the Mébius strip,* shown in Fig. 248. To make 
a model, take the rectangular paper in Fig. 248, make a half-twist, and join the short sides 
together so that A goes onto A, and B onto B. At Fo take a normal vector pointing, say, 
to the left. Displace it along C to the right (in the lower part of the figure) around the strip 
until you return to Py and see that you get a normal vector pointing to the right, opposite 
to the given one. See also Prob. 17. 


Fig. 248. Médbius strip 


® AUGUST FERDINAND MOBIUS (1790-1868), German mathematician, student of Gauss, known for his 
work in surface theory, geometry, and complex analysis (see Sec. 17.2). 
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Surface Integrals Without Regard to Orientation 


Another type of surface integral is 


(6) | [owas = | [eww v))|N(u, v)| du dv. 
R 


Ss 


Here dA = |N|du dv = lr, x r, | du dv is the element of area of the surface S represented 
by (1) and we disregard the orientation. 

We shall need later (in Sec. 10.9) the mean value theorem for surface integrals, which 
states that if R in (6) is simply connected (see Sec. 10.2) and G(r) is continuous in a 
domain containing R, then there is a point (Ug, Vo) in R such that 


(7) | [owas = G(r(upg, Vo))A (A = Area of S). 
Ss 


As for applications, if G(r) is the mass density of S, then (6) is the total mass of S. If 
G = 1, then (6) gives the area A(S) of S, 


(8) A(S) = | [a= [fru % rl ae 
S R 


Examples 4 and 5 show how to apply (8) to a sphere and a torus. The final example, 
Example 6, explains how to calculate moments of inertia for a surface. 
Area of a Sphere 


For a sphere r(u,v) = [acosvucosu, acosusinu, asinv], OSuS27, —-7/2Svu 7/2 [see (3) 
in Sec. 10.5], we obtain by direct calculation (verify!) 


my Xr= [a cos” v cos u, a’ cos” v sin u, a’ cos v sin v]. 


Using cos? u + sin? u = 1 and then cos? v + sin? v = 1, we obtain 


1/2 


Iry x ri = a*(cos* v cos” u + cos* v sin? u + cos” v sin” vy = a?|cos vi. 


With this, (8) gives the familiar formula (note that |cos v| = cos v when —t/2 Sv = 7/2) 
m/2  p2a 77/2 
A(S) = 2? | |cos v| du dv = 2na?| cos v dv = 47ra?. ja 
—7/2 °0 —17/2 
Torus Surface (Doughnut Surface): Representation and Area 


A torus surface S is obtained by rotating a circle C about a straight line L in space so that C does not intersect 
or touch L but its plane always passes through L. If L is the z-axis and C has radius b and its center has distance 
a (> b) from L, as in Fig. 249, then $ can be represented by 


r(u, v) = (a + bcosv) cosui+ (a + bcosv)sinuj + bsinuk 
where 0 = u S 27,0 Sv S 27. Thus 


Ty = —(a + bcosv)sinui+t (a + bcos v) cos uj 
r, = —bsinv cosui— bsinvsinuj + bcosuk 


Ty X Yr, = b(a + bcos v)(cos ucosvi + sinucosv j + sinu k). 
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Hence |r,, X r,| = b(a + bcos v), and (8) gives the total area of the torus, 


Qa par 
(9) A(S) = | | b(a + bcosv) dudv = 41ab. B 
0 0 


x 


Fig. 249. Torus in Example 5 


Moment of Inertia of a Surface 


Find the moment of inertia / of a spherical lamina S: = xe + y? + 2? = a” of constant mass density and total 
mass M about the z-axis. 


Solution. If a mass is distributed over a surface S and (x, y, z) is the density of the mass (= mass per unit 
area), then the moment of inertia J of the mass with respect to a given axis L is defined by the surface integral 


(10) i= | [nora 
Ss 


where D(x, y, z) is the distance of the point (x, y, z) from L. Since, in the present example, is constant and S 
has the area A = 47ra”, we have w= M/A= M/(41ra”). 

For S we use the same representation as in Example 4. Then D? = x24 y? = a” cos” v. Also, as in that example, 
dA = a* cos vdu dv. This gives the following result. [In the integration, use cos? v = cos v (1 — sin? v).] 


w/2 p22 2 7/2 2 
Ma 2Ma 
I | | a‘ cos? v du dv = | cos? v du = . es 
Ama” J 72/0 ~1/2 3 


r= | [up?aa = 
Ss 


Representations z = f(x,y). If a surface S is given by z = f(x, y), then setting u = x, 
v = y,r = [u,v,f] gives 


IN] = Inu X ry! = 10,0, ful X (0, LAll = I-A fe Hl = V1 + £2 + £2 


and, since f,, = fx, fy = fy, formula (6) becomes 


2 


af af 
(11) | [ew dA = | |G. y, f(x, y)) il AP (=) aF (=) dx dy. 
Ox oy 


Re 
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Fig. 250. 


Formula (11) 


Here R* is the projection of S into the xy-plane (Fig. 250) and the normal vector N on S$ 
points up. If it points down, the integral on the right is preceded by a minus sign. 
From (11) with G = | we obtain for the area A(S) of S: z = f(x, y) the formula 


(12) 


apy”. (ary 
a= ffir) *G)ae 


where R* is the projection of S into the xy-plane, as before. 


PROBLEM SET 10-6 


1-10 


FLUX INTEGRALS (3) lr endA 
s 


Evaluate the integral for the given data. Describe the kind 
of surface. Show the details of your work. 


1. F = [-x?,y?,0], Sir = [w,v, 3u — 2v], 
OSHE 15, —2e0S2 

2. F=[e%,e",1], Sixty+z=1, x20, yZO, 
z20 

3. F=[0,x,0], Six? +y?+2=1, x=0, 
y20, 220 

4. F = [e¥, -e*,e*], Six? +y?=25, x20, 


. F = [cosh y, 0, sinh x], 


y2=0, 05752 


. F = [x,y,z], S:r = [wcosv, u sinv, u"], 


—T Svs 
2 
Szaexty, 


OSus4, 
OS y Sx 
OSx2=1 


. F = [0, sin y, cos z],_ S the cylinder x = y?, where 


OSys7/4and08z8y 


8. F = [tan xy, x, y], S: y? te=1, 25x55, 
y20, 220 

9. F = [0, sinh z,coshx], Six? + 22 = 4, 
0Sx=1/V2, 0Sy55, z= 

10. F=[y74.x%,24) Siz =4Vix7%74+y7, O82858, 
y=0 

11. CAS EXPERIMENT. Flux Integral. Write a pro- 


gram for evaluating surface integrals (3) that prints 
intermediate results (F, F « N, the integral over one of 


the two variables). Can you obtain experimentally some 
rules on functions and surfaces giving integrals that can 
be evaluated by the usual methods of calculus? Make 
a list of positive and negative results. 


12-16 


SURFACE INTEGRALS (6) | | G(r) dA 


Evaluate these integrals for the following data. Indicate the 
kind of surface. Show the details. 


12. 


13. 


14. 


15. 


16. 


17. 


G=cosx + sinx, S the portion of x +y+z= 1 
in the first octant 

G=xtytz z=xt+2y, 0OSxS7, 
OSysSx 

G=ax +t by + cz, Six? 4 y? + 2=1, y=0, 
z=0 

G = (1 + 9xz)?/", Sir =[u,v,u7], OSuZl, 
—23052 

G = arctan (y/x), S:z= x2 + y, 1=z2=9, 
x20, y=0 

Fun with Mobius. Make Mobius strips from long slim 


rectangles R of grid paper (graph paper) by pasting the 
short sides together after giving the paper a half-twist. 
In each case count the number of parts obtained by 
cutting along lines parallel to the edge. (a) Make R three 
squares wide and cut until you reach the beginning. 
(b) Make R four squares wide. Begin cutting one square 
away from the edge until you reach the beginning. Then 
cut the portion that is still two squares wide. (c) Make 
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18. 


R five squares wide and cut similarly. (d) Make R six 
squares wide and cut. Formulate a conjecture about the 
number of parts obtained. 

Gauss “Double Ring” (See Mobius, Works 2, 518- 
559). Make a paper cross (Fig. 251) into a “double ring” 
by joining opposite arms along their outer edges (without 
twist), one ring below the plane of the cross and the other 
above. Show experimentally that one can choose any four 
boundary points A, B, C, D and join A and C as well as 
B and D by two nonintersecting curves. What happens if 
you cut along the two curves? If you make a half-twist 
in each of the two rings and then cut? (Cf. E. Kreyszig, 
Proc. CSHPM 13 (2000), 23-43.) 


o~ 


b 


Fig. 251. Problem 18. Gauss “Double Ring” 


APPLICATIONS 


19. 


20. 


Center of gravity. Justify the following formulas for 
the mass M and the center of gravity (x, y, z) of a lamina 
S of density (mass per unit area) o (x, y, z) in space: 


1 
M= | [ovaa, r= | [aoa 
Ss Ss 
e223) (et 
Zz M ZO a 
Ss 


Moments of inertia. Justify the following formulas 
for the moments of inertia of the lamina in Prob. 19 
about the x-, y-, and z-axes, respectively: 


i [Jo + 2)odA, ly = [Jo + 2% dA, 
Ss Ss 


= [Jo + yo dA. 


Ss 


21. 
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Find a formula for the moment of inertia of the lamina 
in Prob. 20 about the line y = x,z = 0. 


22-23 


Find the moment of inertia of a lamina S of density 


1 about an axis B, where 


22. 


23. 
24. 


25. 


26. 


Six? + y? =1,0SzSh, B: the line z=hA/2 in 
the xz-plane 

S: x? y? = 22, OSzsSh, B: the z-axis 
Steiner’s theorem.° If Jp is the moment of inertia of 
a mass distribution of total mass M with respect to a line 
B through the center of gravity, show that its moment 
of inertia Jx with respect to a line K, which is parallel 
to B and has the distance k from it is 


Tg = Ip + k?M. 


Using Steiner’s theorem, find the moment of inertia of 
amass of density | on the sphere S: x2 + y? +2 =1 
about the line K:x = 1, y = 0 from the moment of 
inertia of the mass about a suitable line B, which you 
must first calculate. 


TEAM PROJECT. First Fundamental Form of S. 
Given a surface S: r(u, v), the differential form 


(13) ds? = Edu? + 2Fdudv + Gdv? 


with coefficients (in standard notation, unrelated to F, 
G elsewhere in this chapter) 


(14) E=ry,°', FH=HQKrTrMycery, G=7r7H*ry 


is called the first fundamental form of S. This form 
is basic because it permits us to calculate lengths, 
angles, and areas on S. To show this prove (a)-(c): 


(a) For a curve C: u = u(t),v = v(t),a StSb, on 
S, formulas (10), Sec. 9.5, and (14) give the length 


b 
c= | Vr' er’ dt 


(15) b 
i | V Eu’? + 2Fu'v' + Gu’? dt. 
a 


(b) The angle y between two intersecting curves 
Cy u = g(t),v = h(t) and Co: u = p(t), v = q(t) on 
S: r(u, v) is obtained from 
— acb 

|a||b| 
wherea = r,,g’ + rh’ andb = r,,p’ + r,q’ are tan- 
gent vectors of Cy and Co. 


(16) cos y 


8JACOB STEINER (1796-1863), Swiss geometer, born in a small village, learned to write only at age 14, 
became a pupil of Pestalozzi at 18, later studied at Heidelberg and Berlin and, finally, because of his outstanding 
research, was appointed professor at Berlin University. 
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(c) The square of the length of the normal vector N 
can be written 


(17) IN|? = |r, X r,|? = EG — F?, 


so that formula (8) for the area A(S) of S becomes 


acy = || aa = |{ INI du dv 
R 


Ss 
(18) 


= || VEGF aa 


R 


(d) For polar coordinates u (= r) and v (= @) defined 
by x = ucosv,y = usinv we have E = 1, F = 0, 
G= u?, so that 

ds” = du® + u? dv? = dr? + r? dé”. 
Calculate from this and (18) the area of a disk of 
radius a. 


10.7 Triple Integrals. 
Divergence Theorem of Gauss 


(e) Find the first fundamental form of the torus in 
Example 5. Use it to calculate the area A of the torus. 
Show that A can also be obtained by the theorem of 
Pappus,’ which states that the area of a surface of 
revolution equals the product of the length of a 
meridian C and the length of the path of the center of 
gravity of C when C is rotated through the angle 277. 


(f) Calculate the first fundamental form for the usual 
representations of important surfaces of your own 
choice (cylinder, cone, etc.) and apply them to the 
calculation of lengths and areas on these surfaces. 


In this section we discuss another “big” integral theorem, the divergence theorem, which 
transforms surface integrals into triple integrals. So let us begin with a review of the 


latter. 


A triple integral is an integral of a function f(x, y, z) taken over a closed bounded, 
three-dimensional region T in space. (Note that “closed” and “bounded” are defined in 
the same way as in footnote 2 of Sec. 10.3, with “sphere” substituted for “‘circle’”). We 
subdivide T by planes parallel to the coordinate planes. Then we consider those boxes of 
the subdivision that lie entirely inside 7, and number them from | to n. Here each box 
consists of a rectangular parallelepiped. In each such box we choose an arbitrary point, 
say, (Xk, Ye, Zk) in box k. The volume of box k we denote by AY... We now form the sum 


In = >, fhe Yo Ze) AVE 


k=1 


This we do for larger and larger positive integers n arbitrarily but so that the maximum 
length of all the edges of those n boxes approaches zero as n approaches infinity. This 
gives a sequence of real numbers J,,,, Jn,,°*:. We assume that f(x, y, z) is continuous in a 
domain containing T, and T is bounded by finitely many smooth surfaces (see Sec. 10.5). 
Then it can be shown (see Ref. [GenRef4] in App. 1) that the sequence converges to 
a limit that is independent of the choice of subdivisions and corresponding points 


7PAPPUS OF ALEXANDRIA (about A p. 300), Greek mathematician. The theorem is also called Guldin’s 
theorem. HABAKUK GULDIN (1577-1643) was born in St. Gallen, Switzerland, and later became professor 


in Graz and Vienna. 
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(Xk; Yo Zz). This limit is called the triple integral of f(x, y, z) over the region T and is 


denoted by 
|| [re y, z) dx dy dz or by || [re y, z) dV. 
T T 


Triple integrals can be evaluated by three successive integrations. This is similar to the 
evaluation of double integrals by two successive integrations, as discussed in Sec. 10.3. 
Example 1 below explains this. 


Divergence Theorem of Gauss 


Triple integrals can be transformed into surface integrals over the boundary surface of a 
region in space and conversely. Such a transformation is of practical interest because one 
of the two kinds of integral is often simpler than the other. It also helps in establishing 
fundamental equations in fluid flow, heat conduction, etc., as we shall see. The transformation 
is done by the divergence theorem, which involves the divergence of a vector function 
F= [F1, Fo, F3] = Fyi =F Foj ale F3k, namely, 


OF, OF OF3 
(1) div F = + + (Sec. 9.8). 
Ox oy Oz 


Divergence Theorem of Gauss 
(Transformation Between Triple and Surface Integrals) 


Let T be a closed bounded region in space whose boundary is a piecewise smooth 
orientable surface S. Let F (x, y, z) be a vector function that is continuous and has 
continuous first partial derivatives in some domain containing T. Then 


o [[Jovear= [rena 


In components of F = [F1, Fo, Fs) and of the outer unit normal vector 
n=[cosa, cos PB, cos y] of S (as in Fig. 253), formula (2) becomes 


OF, OF2  dOF3 
ar oP dx dy dz 
Ox oy Oz 


(2*) = || (F, cos a + Fgcos B + F3cos y) dA 


S 
= || (F, dy dz ote F5 dz dx at F3 dx dy). 
S 
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z 


Fig. 252. Surface S 
in Example 1 


PROOF 
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“Closed bounded region” is explained above, “piecewise smooth orientable” in Sec. 10.5, 
and “domain containing T” in footnote 4, Sec. 10.4, for the two-dimensional case. 
Before we prove the theorem, let us show a standard application. 


Evaluation of a Surface Integral by the Divergence Theorem 


Before we prove the theorem, let us show a typical application. Evaluate 


= {| (x? dy dz + xy dz dx oh xz dx dy) 
Ss 


where S is the closed surface in Fig. 252 consisting of the cylinder xe + y? =a" (0 =z Sb) and the circular 
disks z = 0 and z = b (x? + y? <a’). 


Solution. Fy = x°, Fy = xy, F3 = x"z. Hence diy F = 3x? + x? + x? = 5x7. The form of the surface 
suggests that we introduce polar coordinates r, 6 defined by x = r cos 0, y = r sin 6 (thus cylindrical coordinates 
r, 0, z). Then the volume element is dx dy dz = r dr d6 dz, and we obtain 


b 27 pa 
T= {lj 5x" dx dy dz = | | | (Sr? cos” 0) r dr dO dz 
T 


z=0°0=0°r=0 


b Qa a b Sar 
=5/ | © costodvdc= 5 a’b. ea 
z=0 


We prove the divergence theorem, beginning with the first equation in (2*). This 
equation is true if and only if the integrals of each component on both sides are equal; 
that is, 


aFy 
(3) Nes avdyde = || Freos ada, 
x 


Ss 


(4) | [[ sy arora =|] Fees 
Ss 


(5) ||| GB ararae= || Fscos yaa. 
T Ss 


We first prove (5) for a special region T that is bounded by a piecewise smooth 
orientable surface S and has the property that any straight line parallel to any one of the 
coordinate axes and intersecting T has at most one segment (or a single point) in common 
with T. This implies that T can be represented in the form 


(6) gy) Sz Sh, y) 


where (x, y) varies in the orthogonal projection R of T in the xy-plane. Clearly, z = g(x, y) 
represents the “bottom” Sy of S (Fig. 253), whereas z = h(x, y) represents the “top” Sy of 
S, and there may be a remaining vertical portion S3 of S. (The portion Ss; may degenerate 
into a curve, as for a sphere.) 
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To prove (5), we use (6). Since F is continuously differentiable in some domain containing 
T, we have 


aF3 
(7) |[[Paaac=|| 
0z 
T 


MOD oF. 
dx dy. 


—d 
J. az 


g@, y) 


Integration of the inner integral [--- ] gives F3[x, y, h(x, y)] — Fs[x, y, g(x, y)]. Hence the 
triple integral in (7) equals 


(8) | rts y, h(x, y)] dx dy ~ | rote ys g(x, y)] dx dy. 


R R 


x 


Fig. 253. Example of a special region 


But the same result is also obtained by evaluating the right side of (5); that is [see also 
the last line of (2*)], 


|| F3cos y dA = || Feacay 
s S 


= oF || Fx, y, h(x, y)] dx dy = ies y, a(x, y)] dx dy, 


where the first integral over R gets a plus sign because cos y > 0 on Sj in Fig. 253 [as 
in (5”), Sec. 10.6], and the second integral gets a minus sign because cos y < 0 on So. 
This proves (5). 

The relations (3) and (4) now follow by merely relabeling the variables and using the 
fact that, by assumption, T has representations similar to (6), namely, 


Wy.) SxShly,d and zx) Sy Sh». 


This proves the first equation in (2*) for special regions. It implies (2) because the left side 

of (2*) is just the definition of the divergence, and the right sides of (2) and of the first 

equation in (2*) are equal, as was shown in the first line of (4) in the last section. Finally, 

equality of the right sides of (2) and (2*), last line, is seen from (5) in the last section. 
This establishes the divergence theorem for special regions. 
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For any region T that can be subdivided into finitely many special regions by means of 
auxiliary surfaces, the theorem follows by adding the result for each part separately. This 
procedure is analogous to that in the proof of Green’s theorem in Sec. 10.4. The surface 
integrals over the auxiliary surfaces cancel in pairs, and the sum of the remaining surface 
integrals is the surface integral over the whole boundary surface S of T; the triple integrals 
over the parts of T add up to the triple integral over 7. 

The divergence theorem is now proved for any bounded region that is of interest in 
practical problems. The extension to a most general region T of the type indicated in the 
theorem would require a certain limit process; this is similar to the situation in the case 
of Green’s theorem in Sec. 10.4. o 


Verification of the Divergence Theorem 


Evaluate {| (7xi — zk) «ndA over the sphere S: xe + y? +2=4 (a) by (2), (b) directly. 
Ss 
Solution. (a) div F = div [7x, 0, —z] = div [7xi — zk] = 7 — 1 = 6. Answer: 6 « ($)m + 2? = 647. 
(b) We can represent S by (3), Sec. 10.5 (with a = 2), and we shall use n dA = N du du [see (3*), Sec. 10.6]. 
Accordingly, 


S: r=([2cosucosu, 2cosusinu, 2sinu] 
Then r, = [-—2cosusinu, 2cosucosu, 0] 
r, = [-2sinucosu, —2sinusinu, 2cosv] 


N=r, Xr, = [4cos*ucosu, 4cos%usinu, 4cosv sinv]. 
Now on S we have x = 2 cos vu cos u, z = 2 sin v, so that F = [7x, 0, —z] becomes on S 


F(S) = [l4cosucosu, 0, —2sinv] 
and F(S) * N = (14 cos v cos u) « 4 cos” v cos u + (—2 sinv) - 4cos u sinv 


= 56 cos? v cos? u — 8 cos v sin” v. 


On S we have to integrate over u from 0 to 277. This gives 


7 + 56cos*®v — 27 + 8 cosu sin? v. 


The integral of cos v sin? v equals (sin? v)/3, and that of cos? v = cos v (1 — sin? v) equals sin v — (sin? v)/3. 
On S we have —7/2 = v = 77/2, so that by substituting these limits we get 


567(2 — 3) — low - 3 = 64a 


as hoped for. To see the point of Gauss’s theorem, compare the amounts of work. ea] 


Coordinate Invariance of the Divergence. The divergence (1) is defined in terms of 
coordinates, but we can use the divergence theorem to show that div F has a meaning 
independent of coordinates. 

For this purpose we first note that triple integrals have properties quite similar to those 
of double integrals in Sec. 10.3. In particular, the mean value theorem for triple integrals 
asserts that for any continuous function f(x, y, z) in a bounded and simply connected region 
T there is a point Q: (x9, yo, Zo) in T such that 


(9) || 70 y, 2) dV = f(xo, Yo, Zo) VT) = (V(T) = volume of T). 


r 
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In this formula we interchange the two sides, divide by V(T), and set f = div F. Then by 
the divergence theorem we obtain for the divergence an integral over the boundary surface 
S(T) of T, 


1 1 
10 div F(xo, yo, = —_ div FdV = —— || Fenda. 
= is Tall " Vr) Il 
T S(T) 
We now choose a point P: (x4, yy, Z1) in T and let T shrink down onto P so that the 
maximum distance d(T) of the points of T from P goes to zero. Then Q: (x9, yo, Zo) Must 


approach P. Hence (10) becomes 


(11) div F(P) = lors || Fenda. 


S(T) 


This proves 


Invariance of the Divergence 


The divergence of a vector function F with continuous first partial derivatives in a 
region T is independent of the particular choice of Cartesian coordinates. For any 
P in T it is given by (11). 


Equation (11) is sometimes used as a definition of the divergence. Then the representation (1) 
in Cartesian coordinates can be derived from (11). 

Further applications of the divergence theorem follow in the problem set and in the 
next section. The examples in the next section will also shed further light on the nature 
of the divergence. 


PROBLEM SET 10-7 


APPLICATION: MASS DISTRIBUTION 9-18] APPLICATION 
Find the total mass of a mass distribution of density a in OF THE DIVERGENCE THEOREM 


a region T in space. 


Lho=xrtyst 7, T the box |x| $4, |y| $1, 


Evaluate the surface integral | | F «nda by the divergence 
Ss 


05752 theorem. Show the details. 
2.0 = xyz, Tthebox0SxSa, 0OSySb, 9, F= [x?, 0, 27], S the surface of the box |x| = 1, 
OSz5c <= <,7,<= 
aoaye ly $3, OSzS2 
3.0=e"%°", T:08x81-y, 0SyH=l, ; . ; 
0=752 10. Solve Prob. 9 by direct integration. 
4. o as in Prob. 3, T the tetrahedron with vertices (0, 0, 0), 11. F = [e*,e’, e*], S the surface of the cube |x| =1, 
(3, 0, 0), (0, 3, 0), (0, 0, 3) bl si, ks 
5. 0 = sin2xcos2y, T:08x5 dn, 12. F= [x? y3, y? 2, 2 x], S the surface of 
47 —xSySqm, 05256 x+y? +2525, 220 
6.0 = x2y?z?, T the cylindrical region vr+es 16, 13. F = [sin y, cos x, cos z], S$, the surface of 
yl S4 xe + y? =4, |z)S2(a cylinder and two disks!) 
7. o = arctan (y/x), T: x7 + y? +27 a’, z=0 14. F as in Prob. 13, S$ the surface of x7 + y? = 9, 
8. ¢ = x" + y”, Tas in Prob. 7 03752 
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15. F= [2x2 3 y?, sin 77z], S the surface of the tetrahe- 19. The box -aS=xSa, —bSy=b, -cSzSc 
dron with vertices (0, 0, 0), (1, 0, 0), (0, 1,0), (0,0, 1) 20. The ball x2 + y? + 7 <a? 

16. F = [cosh x, z, y], S as in Prob. 15 21. The cylinder y? +2 a’, OSx%=h 


17. F= [x?, y’, 2), S the surface of the cone x? + y -_ 22, 22. The paraboloid y? +2s x OSX Hh 


OSzSh 


23. The cone y? + 27S x7, OSxSh 


18. F = [xy, yz, zx], S the surface of the cone xe y 24. Why is J, in Prob. 23 for large h larger than J, in Prob. 


=40.. 0=Z=2 22 (and the same h)? Why is it smaller for h = 1? Give 
19-23} APPLICATION: MOMENT OF INERTIA phyalcalineason. 
h 
: . ; ; : 7 
Given a mass of density | in a region T of space, find the 25. Show that for a solid of revolution, J, = = | r* (x) dx. 
moment of intertia about the x-axis 2 0 


i= [| + 27) dx dy dz. 
ae 


Solve Probs. 20-23 by this formula. 


10.8 Further Applications 
of the Divergence Theorem 


EXAMPLE 1 


The divergence theorem has many important applications: In fluid flow, it helps characterize 
sources and sinks of fluids. In heat flow, it leads to the heat equation. In potential theory, 
it gives properties of the solutions of Laplace’s equation. In this section, we assume that 
the region T and its boundary surface S' are such that the divergence theorem applies. 


Fluid Flow. Physical Interpretation of the Divergence 


From the divergence theorem we may obtain an intuitive interpretation of the divergence of a vector. For this 
purpose we consider the flow of an incompressible fluid (see Sec. 9.8) of constant density p = 1 which is steady, 
that is, does not vary with time. Such a flow is determined by the field of its velocity vector v(P) at any point P. 

Let S be the boundary surface of a region T in space, and let n be the outer unit normal vector of S. Then 
v © nis the normal component of v in the direction of n, and |v * n dA| is the mass of fluid leaving T (if v* n > 0 
at some P) or entering T (if v* n < 0 at P) per unit time at some point P of S through a small portion AS of 
S of area AA. Hence the total mass of fluid that flows across S from T to the outside per unit time is given by 


the surface integral 
| | vendaA. 


Ss 


Division by the volume V of T gives the average flow out of T: 


(1) 1| fvemaa, 
s 


Since the flow is steady and the fluid is incompressible, the amount of fluid flowing outward must be continuously 
supplied. Hence, if the value of the integral (1) is different from zero, there must be sources (positive sources 
and negative sources, called sinks) in T, that is, points where fluid is produced or disappears. 

If we let T shrink down to a fixed point P in T, we obtain from (1) the source intensity at P given by the 
right side of (11) in the last section with F ¢ n replaced by v « n, that is, 


1 
(2) div v(P) = _ lim aa | [yen 
a(T)0 V(T) 
S(T) 
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EXAMPLE 2 


Hence the divergence of the velocity vector v of a steady incompressible flow is the source intensity of the flow 
at the corresponding point. 
There are no sources in T if and only if div v is zero everywhere in T. Then for any closed surface S in T we have 


| | vendA = 0. ial 
Ss 
Modeling of Heat Flow. Heat or Diffusion Equation 


Physical experiments show that in a body, heat flows in the direction of decreasing temperature, and the rate of 
flow is proportional to the gradient of the temperature. This means that the velocity v of the heat flow in a body 
is of the form 


(3) v = —K grad U 


where U(x, y, z, f) is temperature, f is time, and K is called the thermal conductivity of the body; in ordinary 
physical circumstances K is a constant. Using this information, set up the mathematical model of heat flow, the 
so-called heat equation or diffusion equation. 


Solution. Let T be a region in the body bounded by a surface S with outer unit normal vector n such that 
the divergence theorem applies. Then v - n is the component of v in the direction of n, and the amount of heat 


leaving T per unit time is 
| | vendA. 


Ss 


This expression is obtained similarly to the corresponding surface integral in the last example. Using 


div (grad U) = V?U = Ugg + Uyy + Uxz 


(the Laplacian; see (3) in Sec. 9.8), we have by the divergence theorem and (3) 


| [vena =| | | eiv cera 0) aed ae 


Ss £ 


-K| | [v?udcay ae 
T 


On the other hand, the total amount of heat H in T is 


H= | | [oou aray ae 
T 


where the constant o is the specific heat of the material of the body and p is the density (= mass per unit volume) 
of the material. Hence the time rate of decrease of H is 


aH nm WU rg 
at OP ag Oe ae ae 
ie 


and this must be equal to the above amount of heat leaving T. From (4) we thus have 


aU 
[feo a dx dy dz = k] | [Pe udedyae 
T T 
aU 
[| [(co - xv2u) day de=0 


c 


(4) 


or 
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Since this holds for any region T in the body, the integrand (if continuous) must be zero everywhere; that is, 
(5) — = ou = — 


where c? is called the thermal diffusivity of the material. This partial differential equation is called the heat 
equation. It is the fundamental equation for heat conduction. And our derivation is another impressive 
demonstration of the great importance of the divergence theorem. Methods for solving heat problems will be 
shown in Chap. 12. 

The heat equation is also called the diffusion equation because it also models diffusion processes of motions 
of molecules tending to level off differences in density or pressure in gases or liquids. 

If heat flow does not depend on time, it is called steady-state heat flow. Then dU/dt = 0, so that (5) reduces 
to Laplace’s equation Y?U = 0. We met this equation in Secs. 9.7 and 9.8, and we shall now see that the 
divergence theorem adds basic insights into the nature of solutions of this equation. fs] 


Potential Theory. Harmonic Functions 


The theory of solutions of Laplace’s equation 


2 2 2 
(6) VF = ee ee 


is called potential theory. A solution of (6) with continuous second-order partial derivatives 
is called a harmonic function. That continuity is needed for application of the divergence 
theorem in potential theory, where the theorem plays a key role that we want to explore. 
Further details of potential theory follow in Chaps. 12 and 18. 


A Basic Property of Solutions of Laplace’s Equation 


The integrands in the divergence theorem are div F and F « n (Sec. 10.7). If F is the gradient of a scalar function, 
say, F = grad f, then div F = div (grad f) = V7, see (3), Sec. 9.8. Also, Fen = n* F = n° grad f. This is the 
directional derivative of fin the outer normal direction of S, the boundary surface of the region T in the theorem. 
This derivative is called the (outer) normal derivative of f and is denoted by df/dn. Thus the formula in the 
divergence theorem becomes 


(7) [| [vera = | [za 
Ss 


Ju 


This is the three-dimensional analog of (9) in Sec. 10.4. Because of the assumptions in the divergence theorem 
this gives the following result. | 


A Basic Property of Harmonic Functions 


Let f(x, y, z) be a harmonic function in some domain D is space. Let S be any 
piecewise smooth closed orientable surface in D whose entire region it encloses 
belongs to D. Then the integral of the normal derivative of f taken over S is zero. 
(For “‘piecewise smooth” see Sec. 10.5.) 
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EXAMPLE 4 


EXAMPLE 5 


Green’s Theorems 


Let f and g be scalar functions such that F = fgrad g satisfies the assumptions of the divergence theorem in 
some region 7. Then 


div F = div (fgrad g) 


| 9g og dg] 
= aiv(|r = f—if ) 


@ ag ay @ ag =) (z ag sey 
ax ax t 7 ax2) * ay ay |S ay? az az f2 


= f Vg + grad f+ grad g. 
Also, since f is a scalar function, 


Fen=neF 


n° (fgrad g) 
= (ne grad g) f. 


Now n° grad g is the directional derivative dg/dn of g in the outer normal direction of S. Hence the formula in 
the divergence theorem becomes “Green’s first formula” 


te) 
(8) [| Juve + grad f* grad g) dV = [|r dA. 
T Ss 


Formula (8) together with the assumptions is known as the first form of Green’s theorem. 
Interchanging f and g we obtain a similar formula. Subtracting this formula from (8) we find 


og of 
(9) | | | (fV?g — Vf) dV = | | (2 = :=) dA. 
in on 
av 


Ss 


This formula is called Green’s second formula or (together with the assumptions) the second form of Green’s 
theorem. 


Uniqueness of Solutions of Laplace’s Equation 


Let f be harmonic in a domain D and let f be zero everywhere on a piecewise smooth closed orientable surface 
S in D whose entire region T it encloses belongs to D. Then V2 is zero in T, and the surface integral in (8) is 
zero, so that (8) with g = f gives 


| | [ears era pav = | | Jlenssav = 0 


T 


Since f is harmonic, grad f and thus |grad f| are continuous in T and on S, and since |grad f| is nonnegative, 
to make the integral over T zero, grad f must be the zero vector everywhere in T. Hence f; = fy = f2 = 0, 
and f is constant in T and, because of continuity, it is equal to its value 0 on S. This proves the following 
theorem. 
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THEOREM 2 


THEOREM 3 


THEOREM 3* 
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Harmonic Functions 


Let f(x, y, z) be harmonic in some domain D and zero at every point of a piecewise 
smooth closed orientable surface S in D whose entire region T it encloses belongs 
to D. Then f is identically zero in T. 


This theorem has an important consequence. Let f; and f2 be functions that satisfy the assumptions of Theorem 
1 and take on the same values on S. Then their difference f, — fo satisfies those assumptions and has the value 
0 everywhere on S$. Hence, Theorem 2 implies that 


fi-f2=90 throughout r 


and we have the following fundamental result. 


Uniqueness Theorem for Laplace’s Equation 


Let T be a region that satisfies the assumptions of the divergence theorem, and let 
(x, y, z) be a harmonic function in a domain D that contains T and its boundary 
surface S. Then f is uniquely determined in T by its values on S. 


The problem of determining a solution u of a partial differential equation in a region T such that u assumes 
given values on the boundary surface S of T is called the Dirichlet problem.* We may thus reformulate Theorem 
3 as follows. 


Uniqueness Theorem for the Dirichlet Problem 


If the assumptions in Theorem 3 are satisfied and the Dirichlet problem for the 
Laplace equation has a solution in T, then this solution is unique. 


These theorems demonstrate the extreme importance of the divergence theorem in potential theory. | 


PROBLEEM—SET 10-8 


1. Harmonic functions. Verify Theorem | for f = 2z? — 
x? - y? and S$ the surface of the box 0 =x Sa, 
OSySb, 0878 c. 


1-6| VERIFICATIONS 3. Green’s first identity. Verify (8) for f = 4y?, g = x”, 


S the surface of the “unit cube” OSx31, 
OSyS1, 0Sz = 1. What are the assumptions on 
f and g in (8)? Must f and g be harmonic? 


2. Harmonic functions. Verify Theorem 1 for f= 
x? - y? and the surface of the cylinder + yy = 4, 


<7 


=Z= 


4. Green’s first identity. Verify (8) for f= x, 
g= y? +z”, § the surface of the box 0S x <1, 
0SyS2, 05753. 


8PETER GUSTAV LEJEUNE DIRICHLET (1805-1859), German mathematician, studied in Paris under 
Cauchy and others and succeeded Gauss at Gottingen in 1855. He became known by his important research on 
Fourier series (he knew Fourier personally) and in number theory. 
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5. Green’s second identity. Verify (9) for f = 6y%, 
g = 2x", S the unit cube in Prob. 3. 

6. Green’s second identity. Verify (9) for f = x’, 
g =y*, S the unit cube in Prob. 3. 


7-11| VOLUME 


Use the divergence theorem, assuming that the assumptions 
on T and S are satisfied. 


7. Show that a region T with boundary surface S has the 


volume 
| [raae- | [race = | feaeas 
Ss Ss Ss 


1 
=i] | oayde + ydeae + cara, 
s 


V= 


8. Cone. Using the third expression for v in Prob. 7, 
verify V = 7ra” h/3 for the volume of a circular cone 
of height / and radius of base a. 
9. Ball. Find the volume under a hemisphere of radius a 
from in Prob. 7. 
10. Volume. Show that a region T with boundary surface 
S has the volume 


1 
v= 2] Jreosgan 
s 


where r is the distance of a variable point P: (x, y, z) 
on S from the origin O and @ is the angle between 
the directed line OP and the outer normal of S at P. 


10.9. Stokes’s Theorem 


11. 


12. 
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Make a sketch. Hint. Use (2) in Sec. 10.7 with 
F = [x, y, z]. 

Ball. Find the volume of a ball of radius a from 
Prob. 10. 


TEAM PROJECT. Divergence Theorem and Poten- 
tial Theory. The importance of the divergence theo- 
rem in potential theory is obvious from (7)—(9) and 
Theorems 1-3. To emphasize it further, consider 
functions f and g that are harmonic in some domain D 
containing a region T with boundary surface S such that 
T satisfies the assumptions in the divergence theorem. 
Prove, and illustrate by examples, that then: 


) 
(a) [Jesus - | | [Ionasav, 
on 
Ss T 


(b) If dg/dn = 0 on S, then g is constant in T. 
ag of 
—-g~]dA= 
() NG on g =) 
s 
(d) If af/an = dg/dn on S, thenf = g + cin T, where 


c is a constant. 


(e) The Laplacian can be represented independently 
of coordinate systems in the form 


; 1 of 
Vf= 1 ar 
f= 4G) V(T) || Sar 
SD) 
where d(T) is the maximum distance of the points of a 
region T bounded by S(T) from the point at which the 
Laplacian is evaluated and V(T) is the volume of T. 


Let us review some of the material covered so far. Double integrals over a region in the plane 
can be transformed into line integrals over the boundary curve of that region and conversely, 
line integrals into double integrals. This important result is known as Green’s theorem in the 
plane and was explained in Sec. 10.4. We also learned that we can transform triple integrals 
into surface integrals and vice versa, that is, surface integrals into triple integrals. This “big” 
theorem is called Gauss’s divergence theorem and was shown in Sec. 10.7. 

To complete our discussion on transforming integrals, we now introduce another “big” 
theorem that allows us to transform surface integrals into line integrals and conversely, 
line integrals into surface integrals. It is called Stokes’s Theorem, and it generalizes 
Green’s theorem in the plane (see Example 2 below for this immediate observation). Recall 


from Sec. 9.9 that 


ijk 
(1) curl F = |d/dx d/dy d/dz 
fy We 6S 


which we will need immediately. 
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EXAMPLE 1 


CHAP. 10 Vector Integral Calculus. Integral Theorems 


Stokes’s Theorem’ 
(Transformation Between Surface and Line Integrals) 


Let S be a piecewise smooth? oriented surface in space and let the boundary of S 
be a piecewise smooth simple closed curve C. Let F (x, y, z) be a continuous vector 
function that has continuous first partial derivatives in a domain in space containing 
S. Then 


(2) | | cut +n aa = per!) ds 
Ss Cc 
Here n is a unit normal vector of S and, depending on n, the integration around C 
is taken in the sense shown in Fig. 254. Furthermore, r' = dr/ds is the unit tangent 
vector and s the arc length of C. 
In components, formula (2) becomes 


OF. OF. OF OF. OF. OF 

|| (= - NN (= - Sy, +(e) du dv 
dy az / } Oz ax J 2 ax dy J) 3 

R 


(2*) 
== Fade + Fo dy + F3 dz). 
Cc 
Here, F=([Fy, Fo, Fs], N=[Ni, No, Ng], ndA =Ndudv, rv’ ds = 


[dx, dy, dz], and R is the region with boundary curve C in the uv-plane 
corresponding to S represented by r(u, V). 


The proof follows after Example 1. 


Fig. 254. Stokes’s theorem Fig. 255. Surface S in Example 1 


Verification of Stokes’s Theorem 


Before we prove Stokes’s theorem, let us first get used to it by verifying it for F = [y, z, x] and S the paraboloid 
(Fig. 255) 


z=f(x,y) = 1 (x? t y?), z=20. 


°Sir GEORGE GABRIEL STOKES (1819-1903), Irish mathematician and physicist who became a professor 
in Cambridge in 1849. He is also known for his important contribution to the theory of infinite series and to 
viscous flow (Navier-Stokes equations), geodesy, and optics. 

“Piecewise smooth” curves and surfaces are defined in Secs. 10.1 and 10.5. 
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PROOF 


Solution. The curve C, oriented as in Fig. 255, is the circle r(s) = [cos s, sin s, 0]. Its unit tangent vector 
is r’ (s) = [sin s, cos s, 0]. The function F = [y, z, x] on C is F(r(s)) = [sin s, 0, cos s]. Hence 


20 


2a 
pr edr = | F(r(s)) ¢r’ (s) ds = | [(sin s)(—sin s) + 0 + O] ds TT. 
Cc 0 0 


We now consider the surface integral. We have Fy = y, Fz = z, F3 = x, so that in (2*) we obtain 


curl F = curl [Fy, Fo, F3])=curl[y, z, x] =[-l, —-1, —l]. 


A normal vector of S is N = grad (z — f(x, y)) = [2x, 2y, 1]. Hence (curl F)* N = —2x — 2y — 1. Now 
ndA = N dx dy (see (3*) in Sec. 10.6 with x, y instead of u, v). Using polar coordinates r, 6 defined by 
x = rcos 6, y = rsin@ and denoting the projection of S into the xy-plane by R, we thus obtain 


[| (com) +maa = | cue) -Nadcay = |[2- 2y — 1) dx dy 
Ss R R 


27 rl 
| | (—2r (cos 6 + sin @) — 1)rdr dé 
0=0"r=0 


2a 
2 ; 1 1 
| ( (cos 8 + sin @) ao 0+0 (277) T. ial 
bcos 3 2 2 


We prove Stokes’s theorem. Obviously, (2) holds if the integrals of each component on 
both sides of (2*) are equal; that is, 


OF 1 OF; 
(3) {| (2, = aN, a dv = p Rds 
Oz oy = 
R Cc 
OF 5 OF 5 
(4) ae + Fe 8 du dv = ) F, dy 
R Cc 
OF OF 
(5) wo = ay 2 du dv = Ce 
R 


We prove this first for a surface S that can be represented simultaneously in the forms 


(6) (a) z=f(x, y), (b) y= ge, 2), (c) x = h(y, z). 


We prove (3), using (6a). Setting u = x,v = y, we have from (6a) 
r(u, v) = r(x, y) = [x, y,f@ y)] = xi + yj + fk 


and in (2), Sec. 10.6, by direct calculation 


N=r, X lp = Vy X ty = [fe fy A= -fei-fyj +k. 
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Note that N is an upper normal vector of S, since it has a positive z-component. Also, 
R = S*, the projection of S into the xy-plane, with boundary curve C = C* (Fig. 256). 
Hence the left side of (3) is 


OF, oF 
7 —— (=f) = da dy. 
(7) | | ete ay (OD 
S* 
We now consider the right side of (3). We transform this line integral over C = C* into 


a double integral over S* by applying Green’s theorem [formula (1) in Sec. 10.4 with 
Fs = 0]. This gives 


Fig. 256. Proof of Stokes’s theorem 


Here, Fy = F (x, y, f(x, y)). Hence by the chain rule (see also Prob. 10 in Problem Set 9.6), 


OF 1x, y, fx, y)) OF \(x, y,2Z) OF 1x, y, 2) of 
oy oy Oz oy 


Iz = f(x, y)]. 


We see that the right side of this equals the integrand in (7). This proves (3). Relations 
(4) and (5) follow in the same way if we use (6b) and (6c), respectively. By addition we 
obtain (2*). This proves Stokes’s theorem for a surface §$ that can be represented 
simultaneously in the forms (6a), (6b), (6c). 

As in the proof of the divergence theorem, our result may be immediately extended 
to a surface S that can be decomposed into finitely many pieces, each of which is of 
the kind just considered. This covers most of the cases of practical interest. The proof 
in the case of a most general surface S satisfying the assumptions of the theorem would 
require a limit process; this is similar to the situation in the case of Green’s theorem 
in Sec. 10.4. o 


Green’s Theorem in the Plane as a Special Case of Stokes’s Theorem 


Let F = [F,, Fo] = Fyi+ Foj be a vector function that is continuously differentiable in a domain in the 
xy-plane containing a simply connected bounded closed region S whose boundary C is a piecewise smooth 
simple closed curve. Then, according to (1), 


OF 2 OF, 
(curl F) ¢ n = (curl F) *k = —— — —_. 
Ox oy 
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EXAMPLE 4 


=] 


Fig. 257. 
Example 4 


EXAMPLE 5 


Hence the formula in Stokes’s theorem now takes the form 


OF, OF, 

{i = = dA poPide + Fo dy). 
Ox oy Cc 

Ss 


This shows that Green’s theorem in the plane (Sec. 10.4) is a special case of Stokes’s theorem (which we needed 
in the proof of the latter!). (3) 


Evaluation of a Line Integral by Stokes’s Theorem 


Evaluate f C F ¢r’ds, where C is the circle xe + y? = 4, z = —3, oriented counterclockwise as seen by a person 
standing at the origin, and, with respect to right-handed Cartesian coordinates, 


F=l[y, x23, —zy9] = yi + 224j — gk. 


Solution. As a surface S bounded by C we can take the plane circular disk xe + y? = 4 in the plane z = —3. 
Then n in Stokes’s theorem points in the positive z-direction; thus n = k. Hence (curl F) nis simply the component 
of curl F in the positive z-direction. Since F with z = —3 has the components Fy = y, F2 27x, F3 = 3y°, we 
thus obtain 


OF 2 OF, 
(curl F) ¢n = 21= 1 28. 
Ox oy 


Hence the integral over S in Stokes’s theorem equals —28 times the area 477 of the disk S. This yields the answer 
28 + 47 1127 ~ —352. Confirm this by direct calculation, which involves somewhat more work. B 


Physical Meaning of the Curl in Fluid Motion. Circulation 


Let S,,, be a circular disk of radius rp and center P bounded by the circle C,,, (Fig. 257), and let F(Q) = F(x, y, z) 
be a continuously differentiable vector function in a domain containing S,. Then by Stokes’s theorem and the 
mean value theorem for surface integrals (see Sec. 10.6), 


pr er’ds = | | (curl F) *n dA = (curl F) * n(P*)A,, 
C, 


To Ss 


To. 


where A,,, is the area of S,,, and P* is a suitable point of S;,,. This may be written in the form 


(curl F) * n(P*) = = ; Fer'ds. 


7 Cy, 


In the case of a fluid motion with velocity vector F = v, the integral 


ver'ds 
C,, 


‘0 


is called the circulation of the flow around C,,,. It measures the extent to which the corresponding fluid motion 
is a rotation around the circle C;,,. If we now let ro approach zero, we find 


. 1 
(8) (curl vy) *n(P) = lim — ; ver’ ds; 
770 Ar, C,, 
that is, the component of the curl in the positive normal direction can be regarded as the specific circulation 
(circulation per unit area) of the flow in the surface at the corresponding point. B 


Work Done in the Displacement around a Closed Curve 


Find the work done by the force F = 2xy3 sin zi + 3x2y? sinzj + x2y3 cos zk in the displacement around the 
curve of intersection of the paraboloid z = x” + y? and the cylinder (x — 1)? + y? = 1. 


Solution. — This work is given by the line integral in Stokes’s theorem. Now F = grad f, where f = xy sin z 
and curl (grad f) = 0 (see (2) in Sec. 9.9), so that (curl F) * n = 0 and the work is 0 by Stokes’s theorem. This 
agrees with the fact that the present field is conservative (definition in Sec. 9.7). 8 
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Stokes’s Theorem Applied to Path Independence 


We emphasized in Sec. 10.2 that the value of a line integral generally depends not only 
on the function to be integrated and on the two endpoints A and B of the path of integration 
C, but also on the particular choice of a path from A to B. In Theorem 3 of Sec. 10.2 we 
proved that if a line integral 


(9) | Fo ar = | rae + Reap + Fao 
Cc Cc 


(involving continuous Fy, Fo, F3 that have continuous first partial derivatives) is path 
independent in a domain D, then curl F = 0 in D. And we claimed in Sec. 10.2 that, 
conversely, curl F = 0 everywhere in D implies path independence of (9) in D provided D 
is simply connected. A proof of this needs Stokes’s theorem and can now be given as follows. 


Let C be any closed path in D. Since D is simply connected, we can find a surface S 
in D bounded by C. Stokes’s theorem applies and gives 


Cc 


piPide + Fo dy + F3 dz) = p Fer'as = | (cot) mas 


Cc 


for proper direction on C and normal vector n on S. Since curl F = 0 in D, the surface 
integral and hence the line integral are zero. This and Theorem 2 of Sec. 10.2 imply that 
the integral (9) is path independent in D. This completes the proof. ia 


PROBLEEM—SET 10-9 


1-10 


DIRECT INTEGRATION OF SURFACE 


Evaluate the surface integral 
the given F and S. Ss 


1, 


2: 


. F as in Prob. 1, z= xy OSx= 1, 


INTEGRALS 
| | (curl F) « n dA directly for 


F= [22, —x?, 0], S the rectangle with vertices (0, 0, 0), 
(1, 0, 0), (0, 4, 4), (1, 4, 4) 

F = [—13 siny, 3 sinhz, x], S the rectangle with vertices 
(0,0, 2), (4,0,2), (4, 77/2,2), (0, 7/2, 2) 


. F=[e*,e *cosy,e*siny], S: z= y?/2, 


-lSx51, 0Sye1 


0=y=4). 


Compare with Prob. 1. 


11. 


12. 


Stokes’s theorem not applicable. Evaluate $F or’ ds, 
Cc 

F= (x? + y*) I -y, x);'C: x? + y? = 1,z = 0, ori- 

ented clockwise. Why can Stokes’s theorem not be 

applied? What (false) result would it give? 


WRITING PROJECT. Grad, Div, Curl in 
Connection with Integrals. Make a list of ideas and 
results on this topic in this chapter. See whether you 
can rearrange or combine parts of your material. Then 
subdivide the material into 3-5 portions and work out 
the details of each portion. Include no proofs but simple 
typical examples of your own that lead to a better 


5. F = [z?, 3x, 0], S:0SxSa, 0SySa, understanding of the material. 

z=1 

— 713 _,3 ee 2 se 

CE ee Dee) de Ged 13-20| EVALUATION OF HF +r’ ds 
7. F = [e%,e%e"], Siz=x? OSx82, Ic 

OSyS)) Tina , 

yee 5 5 Calculate this line integral by Stokes’s theorem for the 

8. F=[e x,y], Siz=Vxo +", given F and C. Assume the Cartesian coordinates to be 


10. 


y20, 083z5h 


. Verify Stokes’s theorem for F and S in Prob. 5. 


Verify Stokes’s theorem for F and S in Prob. 6. 


right-handed and the z-component of the surface normal to 
be nonnegative. 


13. 


F = [—5y, 4x, z], C the circle x2 + y? = 16, z=4 


Chapter 10 Review Questions and Problems 


14, F = [23, x3, y3], Cthecirclex = 2, y2+ 22 =9 

15. F= [y’, x, z + x] around the triangle with vertices 
(0, 0, 0), C1, 0, 0), C1, 1, 0) 

16. F = [e”, 0, e”], C as in Prob. 15 


17. F = [0, 2, 0], C the boundary curve of the cylinder 
xr+y=1x20, y20, 0S7S1 


1. State from memory how to evaluate a line integral. 
A surface integral. 

2. What is path independence of a line integral? What is 
its physical meaning and importance? 

3. What line integrals can be converted to surface 
integrals and conversely? How? 

4. What surface integrals can be converted to volume 
integrals? How? 

5. What role did the gradient play in this chapter? The 
curl? State the definitions from memory. 


6. What are typical applications of line integrals? Of 
surface integrals? 


7. Where did orientation of a surface play a role? Explain. 

8. State the divergence theorem from memory. Give 
applications. 

9. In some line and surface integrals we started from 
vector functions, but integrands were scalar functions. 
How and why did we proceed in this way? 

10. State Laplace’s equation. Explain its physical impor- 
tance. Summarize our discussion on harmonic functions. 


11-20) LINE INTEGRALS (WORK INTEGRALS) 


Evaluate | F(x) ¢ dr for given F and C by the method that 
c 
seems most suitable. Remember that if F is a force, the 


integral gives the work done in the displacement along C. 

Show details. 

HF = px", —4y"], C the straight-line segment from 
(4, 2) to (—6, 10) 

12. F = [ycos xy, x cos xy, e*], C the straight-line segment 
from (77, 1, 0) to (4, 7, 1) 

13. F = [y?, 2xy + 5sinx,0], C the 
OSxS7/2, 0Sy82, z=0 

14. F = [—y?,x° + e7%, 0], C the circle x2 + y? = 25, 
g=2 

15. F = [23 e™, e~™], Cx? + Oy? =9, 2 =x? 

16. F = bx, y’, yx], C the helix r = [2 cos f, 2 sin ¢, 3¢] 
from (2, 0, 0) to (—2, 0, 377) 


boundary of 
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18. F = [—y, 2z, 0], C the boundary curve of y? + 2? = 4, 
z=20;, 0OSxSh 

19. F = [z, e*, 0], C the boundary curve of the portion of 
the cone z = x2 + y?, x20; y20, OSzS1 

20. F = [0, cos x, 0], C the boundary curve of y” += 4, 
y20, 720, OSxS7 
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17. F = [9z, 5x,3y], C the ellipse x7+y?=9, 
Z=xt+2 

18. F = [sin 7ry, cos 7x, sin 7x], C the boundary curve of 
OSx21, 0S y3=2, z=x 


19. F = [z, 2y, x], C the helix r = [cost, sint, t] from 
(1, 0, 0) to (1, 0, 277) 
20. F = [ze**, 2 sinh 2y, xe**], C the parabola y =x, 


z=x7,-1lSx=1 


21-25| DOUBLE INTEGRALS, 


CENTER OF GRAVITY 
Find the coordinates x,y of the center of gravity of a 


mass of density f(x, y) in the region R. Sketch R, show 
details. 


21. f = xy, R the triangle with vertices (0, 0), (2, 0), (2, 2) 

22. f=x2 ty, Rix? t+y?Sa*, yZ=0 

23. f=x7, Ri: -lSxS2, x27 SySx+2. Why is 
x >0? 

24.f=1, R:0SyS1-x* 

25. f=ky, k> 0, arbitrary, 0 S y S 1 — x2, 
QOS%x*21 


26. Why are x and y in Prob. 25 independent of k? 


27-35 | SURFACE INTEGRALS | | FendaA. 


Ss 
DIVERGENCE THEOREM 

Evaluate the integral diectly or, if possible, by the divergence 
theorem. Show details. 
27. F = [ax, by, cz], S the sphere x2 + y? + 2? = 36 
28. F=[x + y%,y + 22,24 x7], S the ellipsoid with 

semi-axes of lengths a, b, c 
29. F = [y + z, 20y, 2z3}, S the surface of OS x S32, 
30. F=[l, 1,1], Six? + y?4+ 422 =4, 220 
31. F = [e*, eY, e*], 

yleay lal=1 


IIA 


S the surface of the box |x| 
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32. F = [y?, x7, 27], S the portion of the paraboloid 34, F = [x,xy,z], S the boundary of x7+y?<1, 


zg=x2t y’, z=9 OS2=5 
33. F= by, x7,27], Sir=[u,u2,v], 0SuS2, 35. F=[x+zy+z,x+ y], S the sphere of radius 3 
—23v052 with center 0 


SUMMARY-OF-CHAPTER-LO 


Vector Integral Calculus. Integral Theorems 


Chapter 9 extended differential calculus to vectors, that is, to vector functions v(x, y, Zz) 
or v(t). Similarly, Chapter 10 extends integral calculus to vector functions. This 
involves line integrals (Sec. 10.1), double integrals (Sec. 10.3), surface integrals (Sec. 
10.6), and triple integrals (Sec. 10.7) and the three “big” theorems for transforming 
these integrals into one another, the theorems of Green (Sec. 10.4), Gauss (Sec. 10.7), 
and Stokes (Sec. 10.9). 

The analog of the definite integral of calculus is the line integral (Sec. 10.1) 


dr 
dt dt 


b 
(1) | 0) +e = | @ae+ Peay + Fae) = | F(r(d) ° 
Cc Cc a 


where C: r(t) = [x(), yO), z<O] = xi + yO] + z<OK (a St S bd) is acurve in 
space (or in the plane). Physically, (1) may represent the work done by a (variable) 
force in a displacement. Other kinds of line integrals and their applications are also 
discussed in Sec. 10.1. 

Independence of path of a line integral in a domain D means that the integral 
of a given function over any path C with endpoints P and Q has the same value for 
all paths from P to Q that lie in D; here P and Q are fixed. An integral (1) is 
independent of path in D if and only if the differential form Fy dx + Fo dy + F3 dz 
with continuous Fy, Fo, F’3 is exact in D (Sec. 10.2). Also, if curl F = 0, where 
F = [F\, Fo, F3], has continuous first partial derivatives in a simply connected 
domain D, then the integral (1) is independent of path in D (Sec. 10.2). 

Integral Theorems. The formula of Green’s theorem in the plane (Sec. 10.4) 


OFs oOFy 
R 


Ox oy Cc 


transforms double integrals over a region R in the xy-plane into line integrals over 
the boundary curve C of R and conversely. For other forms of (2) see Sec. 10.4. 
Similarly, the formula of the divergence theorem of Gauss (Sec. 10.7) 


(3) || fac FdvV= [|r endA 
T Ss 
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transforms triple integrals over a region T in space into surface integrals over the 
boundary surface S$ of T, and conversely. Formula (3) implies Green’s formulas 


2 0g 
(4) [|Jorvee + vr ve av = | [ro aa, 
T Ss 

0 0 

(5) [[forvee - evnav = [[(v g "an 

n on 
T Ss 
Finally, the formula of Stokes’s theorem (Sec. 10.9) 
(6) | |ocon F)-ndA = pFer') ds 
‘S C 


transforms surface integrals over a surface S into line integrals over the boundary 
curve C of S and conversely. 


PART C 


Fourier Analysis. 
Partial 
Differential 

’ Equations (PDEs) 


CHAPTER 11 Fourier Analysis 
CHAPTER 12 Partial Differential Equations (PDEs) 


Chapter 11 and Chapter 12 are directly related to each other in that Fourier analysis has 
its most important applications in modeling and solving partial differential equations 
(PDEs) related to boundary and initial value problems of mechanics, heat flow, 
electrostatics, and other fields. However, the study of PDEs is a study in its own right. 
Indeed, PDEs are the subject of much ongoing research. 


Fourier analysis allows us to model periodic phenomena which appear frequently in 
engineering and elsewhere—think of rotating parts of machines, alternating electric currents 
or the motion of planets. Related period functions may be complicated. Now, the ingeneous 
idea of Fourier analysis is to represent complicated functions in terms of simple periodic 
functions, namely cosines and sines. The representations will be infinite series called 
Fourier series.'! This idea can be generalized to more general series (see Sec. 11.5) and 
to integral representations (see Sec. 11.7). 


The discovery of Fourier series had a huge impetus on applied mathematics as well as on 
mathematics as a whole. Indeed, its influence on the concept of a function, on integration 
theory, on convergence theory, and other theories of mathematics has been substantial 
(see [GenRef7] in App. 1). 

Chapter 12 deals with the most important partial differential equations (PDEs) of physics 
and engineering, such as the wave equation, the heat equation, and the Laplace equation. 
These equations can model a vibrating string/membrane, temperatures on a bar, and 
electrostatic potentials, respectively. PDEs are very important in many areas of physics 
and engineering and have many more applications than ODEs. 


1JEAN-BAPTISTE JOSEPH FOURIER (1768-1830), French physicist and mathematician, lived and taught 
in Paris, accompanied Napoléon in the Egyptian War, and was later made prefect of Grenoble. The beginnings 
on Fourier series can be found in works by Euler and by Daniel Bernoulli, but it was Fourier who employed 
them in a systematic and general manner in his main work, Théorie analytique de la chaleur (Analytic Theory 
of Heat, Paris, 1822), in which he developed the theory of heat conduction (heat equation; see Sec. 12.5), making 
these series a most important tool in applied mathematics. oa 


CHAPTER | ] 


Fourier Analysis 


This chapter on Fourier analysis covers three broad areas: Fourier series in Secs. 11.1-11.4, 
more general orthonormal series called Sturm—Liouville expansions in Secs. 11.5 and 11.6 
and Fourier integrals and transforms in Secs. 11.7—11.9. 

The central starting point of Fourier analysis is Fourier series. They are infinite series 
designed to represent general periodic functions in terms of simple ones, namely, cosines 
and sines. This trigonometric system is orthogonal, allowing the computation of the 
coefficients of the Fourier series by use of the well-known Euler formulas, as shown in 
Sec. 11.1. Fourier series are very important to the engineer and physicist because they 
allow the solution of ODEs in connection with forced oscillations (Sec. 11.3) and the 
approximation of periodic functions (Sec. 11.4). Moreover, applications of Fourier analysis 
to PDEs are given in Chap. 12. Fourier series are, in a certain sense, more universal than 
the familiar Taylor series in calculus because many discontinuous periodic functions that 
come up in applications can be developed in Fourier series but do not have Taylor series 
expansions. 

The underlying idea of the Fourier series can be extended in two important ways. We 
can replace the trigonometric system by other families of orthogonal functions, e.g., Bessel 
functions and obtain the Sturm—Liouville expansions. Note that related Secs. 11.5 and 
11.6 used to be part of Chap. 5 but, for greater readability and logical coherence, are now 
part of Chap. 11. The second expansion is applying Fourier series to nonperiodic 
phenomena and obtaining Fourier integrals and Fourier transforms. Both extensions have 
important applications to solving PDEs as will be shown in Chap. 12. 

In a digital age, the discrete Fourier transform plays an important role. Signals, such 
as voice or music, are sampled and analyzed for frequencies. An important algorithm, in 
this context, is the fast Fourier transform. This is discussed in Sec. 11.9. 

Note that the two extensions of Fourier series are independent of each other and may 
be studied in the order suggested in this chapter or by studying Fourier integrals and 
transforms first and then Sturm—Liouville expansions. 


Prerequisite: Elementary integral calculus (needed for Fourier coefficients). 
Sections that may be omitted in a shorter course: 11.4-11.9. 
References and Answers to Problems: App. | Part C, App. 2. 


11.1 Fourier Series 
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Fourier series are infinite series that represent periodic functions in terms of cosines and 
sines. As such, Fourier series are of greatest importance to the engineer and applied 
mathematician. To define Fourier series, we first need some background material. 
A function f(x) is called a periodic function if f(x) is defined for all real x, except 
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f(x) 


P 


Fig. 258. Periodic function of period p 


possibly at some points, and if there is some positive number p, called a period of f(x), 
such that 


(1) f(x + p) = f(x) for all x. 


(The function f(x) = tan x is a periodic function that is not defined for all real x but 
undefined for some points (more precisely, countably many points), that is x = +77/2, 
+377/2,---.) 

The graph of a periodic function has the characteristic that it can be obtained by periodic 
repetition of its graph in any interval of length p (Fig. 258). 

The smallest positive period is often called the fundamental period. (See Probs. 2-4.) 

Familiar periodic functions are the cosine, sine, tangent, and cotangent. Examples of 
functions that are not periodic are x, x2, x3, e”, cosh x, and In x, to mention just a few. 

If f(x) has period p, it also has the period 2p because (1) implies f(x + 2p) = 
F(x + p] + p) = f(x + p) = f(), etc.; thus for any integer n = 1, 2,3,---, 


(2) F(x + np) = f(x) for all x. 


Furthermore if f(x) and g(x) have period p, then af(x) + bg(x) with any constants a and 
b also has the period p. 


Our problem in the first few sections of this chapter will be the representation of various 
functions f(x) of period 277 in terms of the simple functions 


(3) il, cos x, sin x, cos 2x, sin 2x,-:-, cosmx, sin nx,:::. 
All these functions have the period 277. They form the so-called trigonometric system. 


Figure 259 shows the first few of them (except for the constant 1, which is periodic with 
any period). 


l ! | l | l ! | 
a SKA. AAUAAVEC 


cos x cos 2x cos 3x 
a SG veel 
sin x sin 2x sin 3x 


Fig. 259. Cosine and sine functions having the period 277 (the first few members of the 
trigonometric system (3), except for the constant 1) 
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The series to be obtained will be a trigonometric series, that is, a series of the form 


dg + ay cos x + by sin x + dg cos 2x + bo sin 2x + -:- 


4 = : 
4) =adot SS (dy, cos nx + by, sin nx). 
n=1 
do, 41, D1, dz, be,:+* are constants, called the coefficients of the series. We see that each 


term has the period 277. Hence if the coefficients are such that the series converges, its 
sum will be a function of period 277. 

Expressions such as (4) will occur frequently in Fourier analysis. To compare the 
expression on the right with that on the left, simply write the terms in the summation. 
Convergence of one side implies convergence of the other and the sums will be the 
same. 

Now suppose that f(x) is a given function of period 277 and is such that it can be 
represented by a series (4), that is, (4) converges and, moreover, has the sum f(x). Then, 
using the equality sign, we write 


(5) f(x) = ao + S (an cos nx + by sin nx) 


n=1 


and call (5) the Fourier series of f(x). We shall prove that in this case the coefficients 
of (5) are the so-called Fourier coefficients of f(x), given by the Euler formulas 


1 T 
(0) a =-+| f(x) dx 


2 _ 
1 7 
(6) (a) GQ, = =| f(x) cos nx dx n= 1,2,-°° 
if 
(b) | F(x) sin nx dx n=1,2,-°:. 


The name “Fourier series” is sometimes also used in the exceptional case that (5) with 
coefficients (6) does not converge or does not have the sum f(x)—this may happen but 
is merely of theoretical interest. (For Euler see footnote 4 in Sec. 2.5.) 


A Basic Example 


Before we derive the Euler formulas (6), let us consider how (5) and (6) are applied in 
this important basic example. Be fully alert, as the way we approach and solve this 
example will be the technique you will use for other functions. Note that the integration 
is a little bit different from what you are familiar with in calculus because of the n. Do 
not just routinely use your software but try to get a good understanding and make 
observations: How are continuous functions (cosines and sines) able to represent a given 
discontinuous function? How does the quality of the approximation increase if you take 
more and more terms of the series? Why are the approximating functions, called the 
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partial sums of the series, in this example always zero at 0 and 77? Why is the factor 
1/n (obtained in the integration) important? 


EXAMPLE 1 Periodic Rectangular Wave (Fig. 260) 


Find the Fourier coefficients of the periodic function f(x) in Fig. 260. The formula is 


—k if -mw<x<0 
(7) f(x) -{ and f(x + 27r) = f(x). 
kif O<x<7 


Functions of this kind occur as external forces acting on mechanical systems, electromotive forces in electric 
circuits, etc. (The value of f(x) at a single point does not affect the integral; hence we can leave f(x) undefined 
at x = O and x = +7.) 


Solution. From (6.0) we obtain a9 = 0. This can also be seen without integration, since the area under the 
curve of f(x) between —7 and 77 (taken with a minus sign where f(x) is negative) is zero. From (6a) we obtain 
the coefficients a1, dg,--- of the cosine terms. Since f(x) is given by two expressions, the integrals from —7 
to 7 split into two integrals: 


1 7 1 0 T 
an = =| F(x) cos nx dx = = | (—k) cos nx dx + | Koos neds] 
0 


7 —T 
. FA TT 
il | sin nx _ , Sin nx 0 
t 
7 n ae np 
because sin nx = 0 at —77, 0, and 7 for all n = 1, 2,---. We see that all these cosine coefficients are zero. That 


is, the Fourier series of (7) has no cosine terms, just sine terms, it is a Fourier sine series with coefficients 
by, be, +++ obtained from (6b); 


1 7 1 0 7 
by = =| J (x) sin nx dx = ail (—k) sin nx dx + | k sin nx as] 
0 


=a —T 
TT 
cos nx 
0 


| cos nx |° 
k 


n 
-7T 


Since cos (—a@) = cos a and cos 0 = 1, this yields 


k 2k 
bn aT [cos 0 — cos (—n7r) — cosn7r + cos 0] ar (1 — cos n7r). 


Now, cos 7 = —1, cos 277 = 1, cos 377 = —1, etce.; in general, 


—1 for oddn, 2 for oddn, 
cos ni = and thus 1 — cos nt = 
1 for even n, 0 for even n. 


Hence the Fourier coefficients b,, of our function are 


es 5 joe iy ie oe 
La be = 0, BaF 4 = 0, 5 Be 
f(x) 
wa 0 1 20 x 


Fig. 260. Given function f(x) (Periodic reactangular wave) 
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Since the a,, are zero, the Fourier series of f(x) is 


4k ts 1. 
(8) a sin + x sin 3x + 5 sin5x + ---). 
The partial sums are 
4k. 4k ( ie 
S; = =sinx, So = —| sinx + Z sin 3x }. etc. 
7 7 3 


Their graphs in Fig. 261 seem to indicate that the series is convergent and has the sum f(x), the given function. 
We notice that at x = 0 and x = 77, the points of discontinuity of f(x), all partial sums have the value zero, the 
arithmetic mean of the limits —k and k of our function, at these points. This is typical. 

Furthermore, assuming that f(x) is the sum of the series and setting x = 77/2, we have 


a 4k 1. tol ; 
(§)-1-8(-f4-+-) 


1 1 1 T 
3 5 #7 4° 


Thus 


This is a famous result obtained by Leibniz in 1673 from geometric considerations. It illustrates that the values 
of various series with constant terms can be obtained by evaluating Fourier series at specific points. fe 


Fig. 261. First three partial sums of the corresponding Fourier series 
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THEOREM 1 


PROOF 


Derivation of the Euler Formulas (6) 


The key to the Euler formulas (6) is the orthogonality of (3), a concept of basic importance, 
as follows. Here we generalize the concept of inner product (Sec. 9.3) to functions. 


Orthogonality of the Trigonometric System (3) 


The trigonometric system (3) is orthogonal on the interval —7 S x = 7 (hence 
also on 0 S$ x S 27 or any other interval of length 277 because of periodicity); that 
is, the integral of the product of any two functions in (3) over that interval is 0, so 
that for any integers n and m, 


(a) | cos nx cos mx dx = 0 (n # m) 
(9) (b) | sin nx sin mx dx = 0 (n # m) 
(c) | sin nx cos mx dx = 0 (n # morn =m). 


=F 


This follows simply by transforming the integrands trigonometrically from products into 
sums. In (9a) and (9b), by (11) in App. A3.1, 


| cos nx cos mx dx = Al cos (n + m)x dx + Al cos (n — m)x dx 
T 1 T 1 7 
sin nx sin mx dx = o cos (n — m)x dx — 5 cos (n + m)x dx. 


Since m # n (integer!), the integrals on the right are all 0. Similarly, in (9c), for all integer 
m and n (without exception; do you see why?) 


| sin necos mx de = 3] sin(n + mac de + 5 sin(n —m)xdx=0+0. 


-7T -7T =7T 


Application of Theorem 1 to the Fourier Series (5) 
We prove (6.0). Integrating on both sides of (5) from —7 to 77, we get 


| f(x) dx = | ay + S (a, cos nx + by sin ns) | dx. 


-—7 -7 n=1 


We now assume that termwise integration is allowed. (We shall say in the proof of 
Theorem 2 when this is true.) Then we obtain 


| fora = ao dx + > (en cos mrad + by | 


-7 -7 n=1 -7 -7 


T 


sin nx ax) 
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THEOREM 2 


) 


ra 


f(1 - 0) 


f(1 +0) 
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The first term on the right equals 27rag. Integration shows that all the other integrals are 0. 
Hence division by 277 gives (6.0). 

We prove (6a). Multiplying (5) on both sides by cos mx with any fixed positive integer 
m and integrating from —77 to 77, we have 


(10) | F(x) cos mx dx = | a + Dy (dy, cos nx + by, sin nx) | cos mx dx. 


-—7 -—7 n=1 


We now integrate term by term. Then on the right we obtain an integral of ag cos mx, 
which is 0; an integral of a,, cos nx cos mx , which is a,,77 for n = m and 0 for n # m by 
(9a); and an integral of b, sin nx cos mx, which is 0 for all n and m by (9c). Hence the 
right side of (10) equals a,,7r. Division by 77 gives (6a) (with m instead of 7). 

We finally prove (6b). Multiplying (5) on both sides by sin mx with any fixed positive 
integer m and integrating from —77 to 77, we get 


T 


(11) | feysinmeac = | 


-7T -7 


c + > (dy, cos nx + by, sin nx) | sin mx dx. 


n=1 


Integrating term by term, we obtain on the right an integral of dg sin mx, which is 0; an 
integral of a, cos nx sin mx, which is 0 by (9c); and an integral of b, sin nx sin mx, which 
is byt if n = mand 0 if n # m, by (9b). This implies (6b) (with n denoted by m). This 
completes the proof of the Euler formulas (6) for the Fourier coefficients. a 


Convergence and Sum of a Fourier Series 


The class of functions that can be represented by Fourier series is surprisingly large and 
general. Sufficient conditions valid in most applications are as follows. 


Representation by a Fourier Series 


Let f(x) be periodic with period 277 and piecewise continuous (see Sec. 6.1) in the 
interval —7 = x S 7. Furthermore, let f(x) have a left-hand derivative and a right- 
hand derivative at each point of that interval. Then the Fourier series (5) of f(x) 
[with coefficients (6)] converges. Its sum is f(x), except at points Xy where f(x) is 
discontinuous. There the sum of the series is the average of the left- and right-hand 
limits” of f(x) at xo. 


?The left-hand limit of f(x) at x9 is defined as the limit of f(x) as x approaches xp from the left 
and is commonly denoted by f(x9 — 0). Thus 


f(x — 0) = iim f(% — A) as h = O through positive values. 


1 


Fig. 262. Left- and 
right-hand limits 


f(l — 0) = 1, 
fl+op=3 
of the function 


x ifx <1 
F(x) -{ 
x/2 ifx21 


The right-hand limit is denoted by f(xg + 0) and 
f(x + 0) = lim f(%o + h) as h — 0 through positive values. 


The left- and right-hand derivatives of f(x) at xo are defined as the limits of 


(x0 — h) — f(xo — 0) f(xo + h) — flxo + 0) 
= and = ; 


respectively, as h — 0 through positive values. Of course if f(x) is continuous at xg, the last term in 
both numerators is simply f(x9). 
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PROOF 


EXAMPLE 2 


We prove convergence, but only for a continuous function f(x) having continuous first 
and second derivatives. And we do not prove that the sum of the series is f(x) because 
these proofs are much more advanced; see, for instance, Ref. [C12] listed in App. 1. 
Integrating (6a) by parts, we obtain 


ral f' (x) sin nx dx. 


= oT aif 


f(x) sin nx 
nT 


ay = =| F(x) cos nx dx = 


=TT 


The first term on the right is zero. Another integration by parts gives 


7 
1 


i | ie (x) cos nx dx. 
no 


-—7T =a 


_ f' (x) cos nx 


n 
na 


The first term on the right is zero because of the periodicity and continuity of f’(x). Since 
f” is continuous in the interval of integration, we have 


If"@| <M 


for an appropriate constant M. Furthermore, |cos nx| = 1. It follows that 


i 2M 
|ay| = 2 < 2 | Max = 3° 
nN TT 


nom J n 


| f” (x) cos nx dx 


=a 


Similarly, byl <2M/ n? for all n. Hence the absolute value of each term of the Fourier 
series of f(x) is at most equal to the corresponding term of the series 
1 1 1 1 
Jag] + 2M (1 +14 | | -) 
92 92 3232 


which is convergent. Hence that Fourier series converges and the proof is complete. 
(Readers already familiar with uniform convergence will see that, by the Weierstrass 
test in Sec. 15.5, under our present assumptions the Fourier series converges uniformly, 
and our derivation of (6) by integrating term by term is then justified by Theorem 3 of 
Sec. 15.5.) ia 


Convergence at a Jump as Indicated in Theorem 2 


The rectangular wave in Example | has a jump at x = 0. Its left-hand limit there is —k and its right-hand limit 
is k (Fig. 261). Hence the average of these limits is 0. The Fourier series (8) of the wave does indeed converge 
to this value when x = 0 because then all its terms are 0. Similarly for the other jumps. This is in agreement 
with Theorem 2. 3] 


Summary. A Fourier series of a given function f(x) of period 277 is a series of the form 
(5) with coefficients given by the Euler formulas (6). Theorem 2 gives conditions that are 
sufficient for this series to converge and at each x to have the value f(x), except at 
discontinuities of f(x), where the series equals the arithmetic mean of the left-hand and 
right-hand limits of f(x) at that point. 
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PROBLEM SET 11-1 


1-5| PERIOD, FUNDAMENTAL PERIOD 


The fundamental period is the smallest positive period. Find 
it for 


1. cosx, sinx, cos2x, sin2x, cos7x, sin 77x, 
cos 27rx, sin 27x 
: 277x _ 2m7x 277nx 
2. cosnx, sinnx, cos =a sin kL” cos Ep 
_ 27Tnx 
sin i 


3. If f(x) and g(x) have period p, show that h(x) = 
af(x) + bg(x) (a, b, constant) has the period p. Thus 
all functions of period p form a vector space. 


4. Change of scale. If f(x) has period p, show that 
f(ax), a # 0, and f(x/b), b # 0, are periodic functions 
of x of periods p/a and bp, respectively. Give examples. 


5. Show that f = const is periodic with any period but has 
no fundamental period. 


6-10 | GRAPHS OF 277—-PERIODIC FUNCTIONS 


Sketch or graph f(x) which for —7 < x < 7 is given as 
follows. 


6. f(x) = |x| 
7. f(x) = |sinx|, f(x) = sin |x| 
8. f(y =e! fa) = le™| 


x if -—7r<x<0 
9. f(x) = 
7T—x if O<x<T7 
—cos?x if —7<x<0 
10. f(x) = 
cos’ x if O< x <7 


11. Calculus review. Review integration techniques for 
integrals as they are likely to arise from the Euler 
formulas, for instance, definite integrals of x cos nx, 


x? sin nx, e~2” cos nx, etc. 


12-21| FOURIER SERIES 


Find the Fourier series of the given function f(x), which is 
assumed to have the period 277. Show the details of your 
work. Sketch or graph the partial sums up to that including 
cos 5x and sin 5x. 


12. f(x) in Prob. 6 

13. f(x) in Prob. 9 

14. f(x) =x? (-7 <x<7) 
15. f(x) =x? (0<x< 27) 
16. 


17. 


18. 


19. 


20. 


21. 


22. 


23. 


24. 


1 
I e) 1 
1 

l 
— 16) 1 
1 
L 
I () T 


Tt 


CAS EXPERIMENT. Graphing. Write a program for 
graphing partial sums of the following series. Guess 
from the graph what f(x) the series may represent. 
Confirm or disprove your guess by using the Euler 
formulas. 


(a) 2(sin.x + 4 sin 3x + 3 sin 5x + --:) 


= 2(5 sin 2x + 4sin 4x oP é sin 6x ---) 


ies ee eee ey 
2 T = COS X T 9 COS IX Tt 25 COS DX T 
2 


(c) 37? + 4(cos x — z cos 2x + 5 cos 3x — ié cos 4x 
+--+) 

Discontinuities. Verify the last statement in Theorem 

2 for the discontinuities of f(x) in Prob. 21. 

CAS EXPERIMENT. Orthogonality. Integrate and 

graph the integral of the product cos mx cos nx (with 


various integer m and n of your choice) from —a to a 
as a function of a and conclude orthogonality of cos mx 
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25. 


and cos nx (m # n) fora = 77 from the graph. For what if f is continuous but f’ = df/dx is discontinuous, 1/n? 
m and n will you get orthogonality for a = 7/2, 7/3, if f and f’ are continuous but f” is discontinuous, etc. 
7/4? Other a? Extend the experiment to cos mx sin nx Try to verify this for examples. Try to prove it by 
and sin mx sin nx. integrating the Euler formulas by parts. What is the 
CAS EXPERIMENT. Order of Fourier Coefficients. practical significance of this? 


The order seems to be 1/n if fis discontinous, and 1/ n” 


11.2 Arbitrary Period. Even and Odd Functions. 
Half-Range Expansions 


We now expand our initial basic discussion of Fourier series. 


Orientation. This section concerns three topics: 


1. Transition from period 277 to any period 2L, for the function f, simply by a 
transformation of scale on the x-axis. 


2. Simplifications. Only cosine terms if f is even (“Fourier cosine series”). Only sine 
terms if f is odd (“Fourier sine series’’). 


3. Expansion of f given for 0 = x S L in two Fourier series, one having only cosine 
terms and the other only sine terms (“half-range expansions’). 


1. From Period 277 to Any Period p = 2L 


Clearly, periodic functions in applications may have any period, not just 277 as in the last 
section (chosen to have simple formulas). The notation p = 2L for the period is practical 
because L will be a length of a violin string in Sec. 12.2, of a rod in heat conduction in 
Sec. 12.5, and so on. 

The transition from period 277 to be period p = 2L is effected by a suitable change of 
scale, as follows. Let f(x) have period p = 2L. Then we can introduce a new variable v 
such that f(x), as a function of v, has period 277. If we set 


P 277 T 
(1) (a) x= Sap v, so that (b) v= P x= z* 


then v = +7 corresponds to x = +L. This means that f, as a function of v, has period 
277 and, therefore, a Fourier series of the form 


(2) F(x) =5(2 o) = dg + > (dy cos nv + by sin nv) 


n=1 


with coefficients obtained from (6) in the last section 
i" 42 if’ Az 
«= 7 | (Fe) a n= %| f( Ecos no ao 
i 2 oe 
by = =| (2 ») sin nv du. 


id 


(3) 
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We could use these formulas directly, but the change to x simplifies calculations. Since 


T ax 


(4) v= ay we have dv = — 
L EL 


and we integrate over x from —L to L. Consequently, we obtain for a function f(x) of 
period 2L the Fourier series 


(5) f@®=a+> («, cos a + Dp sin x) 


n=1 


with the Fourier coefficients of f(x) given by the Euler formulas (77/L in dx cancels 
1/77 in (3)) 


L 
1 
(0) «= 7 f(x) dx 


D2 
— Tf, 
1 L 
NTTX 
(6) (a) ad, = i] 7 cos we n=1,2,-°° 
1 u TH 
(b) m= + f(x) sin dx n=1,2,-. 
EN ib 


Just as in Sec. 11.1, we continue to call (5) with any coefficients a trigonometric series. 
And we can integrate from 0 to 2L or over any other interval of length p = 2L. 


Periodic Rectangular Wave 


Find the Fourier series of the function (Fig. 263) 


0 if -2<x<-1 
f=sk if -l<x< 1 p=2b=4, L=2. 
0 if I<x< 2 


Solution. From (6.0) we obtain ay = k/2 (verify!). From (6a) we obtain 


ies nx 2k | nt 
k cos dx sin —. 
“1 2 


1 f FQ) nTrx . 
Qn = — x) cos Ix 
emey der 2 


Thus a, = 0 if n is even and 


Qn = 2k/nw if n=1,5,9,-°°, Qn = —2k/nw if n= 3,7, 11,°°-. 


From (6b) we find that b, = 0 forn = 1, 2,---. Hence the Fourier series is a Fourier cosine series (that is, it 
has no sine terms) 


k 2k 
SQ) t cos — x cos 6 Cos x pense: ei 
2 
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f(x) 

— k 
7) 
k a2 2 x 
L~ , 
| ees ee 
2 -l ) 1 2 x 
Fig. 263. Example 1 Fig. 264. Example 2 


EXAMPLE 2 


EXAMPLE 3 


Periodic Rectangular Wave. Change of Scale 
Find the Fourier series of the function (Fig. 264) 


—k if 
S(x) = 


=1<a x= 0 
p=2L=4, L=2. 
kif OSD 


Solution. Since L = 2, we have in (3) v = 7x/2 and obtain from (8) in Sec. 11.1 with v instead of x, that is, 


Ak Lt, It. 
g(v) sinv + —sin3v + —sin5v +--- 

7 3 5 

the present Fourier series 
FQ) ( 3nd ST ) 
x sin —x + — sin —x + — sin —x 4 
3 2 5 2 

Confirm this by using (6) and integrating. @ 


Half-Wave Rectifier 


A sinusoidal voltage E sin wt, where f is time, is passed through a half-wave rectifier that clips the negative 
portion of the wave (Fig. 265). Find the Fourier series of the resulting periodic function 


0 if =-L<1<0, ar 
u(t) = p=2L , L 
Esinot if 0<t<L o a 


Solution. Since u = 0 when —L < t < 0, we obtain from (6.0), with ¢ instead of x, 


@ ae ; E 
dg = — E sin wt dt = — 
277 Jp T 


and from (6a), by using formula (11) in App. A3.1 with x = wt and y = nat, 


T/o T/w 
wo : wE . . 
m= =| E sin wt cos not dt = SE | [sin (1 + n) wt + sin (1 — n) at] dt. 
T Sy 277 Jy 
If n = 1, the integral on the right is zero, and if n = 2, 3,---, we readily obtain 
wE cos(1 + n)wt cos(1 — n)wt)7/e 
an 

27 | (+ njo d—-n)o I 
E ( cos (1 + n)7 +1 cos (1 — nj + ~) 
27 l+n l-n : 


If n is odd, this is equal to zero, and for even n we have 


E 2 2 2E 
An + (n = 2,4,°-:). 
27\l+n l-n” (n — 1)\(n + 1) 
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In a similar fashion we find from (6b) that b} = E/2 and b,, = 0 for n = 2, 3,---. Consequently, 


E E., 2E 1 1 
u(t) t sin wt cos 2wt + cos 4mt + -::}. 3 
T 2 w\1:3 a 


u(t) 


-nlo 0) nlo t 


Fig. 265. Half-wave rectifier 


2. Simplifications: Even and Odd Functions 


If f(x) is an even function, that is, f(—x) = f(x) (see Fig. 266), its Fourier series (5) 
reduces to a Fourier cosine series 


R 


(5*) ff) = ao + S Ay, COS ee (f even) 


n=1 


Fig. 266. 
Even function 
with coefficients (note: integration from 0 to L only!) 
L ii 


(6%) w= | sores a= P| Se eos 7 de w= 1,2, 


R 


If f(x) is an odd function, that is, f(—x) = —f(x) (see Fig. 267), its Fourier series (5) 


Fig. 267. A 5 
8 reduces to a Fourier sine series 


Odd function 
(5**) f@% = y by, sin ee (f odd) 
n=1 


with coefficients 
2 P 1. 
NIX 
(6**) by = “ll F(x) sin |, 


These formulas follow from (5) and (6) by remembering from calculus that the definite 
integral gives the net area (= area above the axis minus area below the axis) under the 
curve of a function between the limits of integration. This implies 


L L 
(a) | g(x) dx = 2 | g(x) dx for even g 
(7) =L 0 
L 
(b) | h(x) dx = 0 for odd h 
-L 


Formula (7b) implies the reduction to the cosine series (even f makes f(x) sin (n7rx/L) odd 
since sin is odd) and to the sine series (odd f makes f(x) cos (n7rx/L) odd since cos is even). 
Similarly, (7a) reduces the integrals in (6*) and (6**) to integrals from 0 to L. These reductions 
are obvious from the graphs of an even and an odd function. (Give a formal proof.) 
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EXAMPLE 4 


THEOREM 1 


EXAMPLE 5 


Summary 


Even Function of Period 27. If fis even and L = 77, then 


foe} 
S(x) = a9 + Ss Ay, COS NX 
n=1 
with coefficients 


T 


«= =| Fx) dx, n= =| F(x) cos nx dx, n= 1,2,++- 
0 0 


Odd Function of Period 277. If fis odd and L = 7, then 


with coefficients 


| F(x) sin nx dx, n=1,2,-::. 
0 


Fourier Cosine and Sine Series 


The rectangular wave in Example | is even. Hence it follows without calculation that its Fourier series is a 
Fourier cosine series, the b, are all zero. Similarly, it follows that the Fourier series of the odd function in 
Example 2 is a Fourier sine series. 

In Example 3 you can see that the Fourier cosine series represents u(t) — E/7 — 35 sin wt. Can you prove 
that this is an even function? 


Further simplifications result from the following property, whose very simple proof is left 
to the student. 


Sum and Scalar Multiple 


The Fourier coefficients of a sum f, + fg are the sums of the corresponding Fourier 


coefficients of f, and fa. 
The Fourier coefficients of cf are c times the corresponding Fourier coefficients of f. 


Sawtooth Wave 
Find the Fourier series of the function (Fig. 268) 


f@m=xt+q0 if -TW<x<T7 and f(x + 277) = f(x). 


f(x) 


—% 1 x 


Fig. 268. The function f(x). Sawtooth wave 
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l 
- 0 1 x 


Fig. 269. Partial sums $1, S2, 53, Soo in Example 5 


Solution. We have f = f, + fo, where fy = x and fy = 77. The Fourier coefficients of fy are zero, except for 
the first one (the constant term), which is 77. Hence, by Theorem 1, the Fourier coefficients ay, by, are those of 
fi, except for do, which is 77. Since fy is odd, a, = 0 for n = 1, 2,---, and 


T 


oa 2 
by = =| fi@) sin nx dx = =| x sin nx dx. 
0 0 


Integrating by parts, we obtain 


2f—xeosnx|” 1 [7 2 
by + cos nx dx | = ——cosnT. 
7 n n n 
0 0 
Hence b, = 2, bo 2 bg 2 ba 2 ++, and the Fourier series of f(x) is 
: : L , ie : 
SQ) = 7 + 2\| sinx 5 sin 2x +4 ; sin 3x tres], (Fig. 269) Mi 


3. Half-Range Expansions 


Half-range expansions are Fourier series. The idea is simple and useful. Figure 270 
explains it. We want to represent f(x) in Fig. 270.0 by a Fourier series, where f(x) 
may be the shape of a distorted violin string or the temperature in a metal bar of length 
L, for example. (Corresponding problems will be discussed in Chap. 12.) Now comes 
the idea. 

We could extend f(x) as a function of period LZ and develop the extended function into 
a Fourier series. But this series would, in general, contain both cosine and sine terms. We 
can do better and get simpler series. Indeed, for our given f we can calculate Fourier 
coefficients from (6*) or from (6**). And we have a choice and can take what seems 
more practical. If we use (6*), we get (5*). This is the even periodic extension /; of f 
in Fig. 270a. If we choose (6**) instead, we get (5**), the odd periodic extension /2 of 
f in Fig. 270b. 

Both extensions have period 2L. This motivates the name half-range expansions: / is 
given (and of physical interest) only on half the range, that is, on half the interval of 
periodicity of length 2L. 

Let us illustrate these ideas with an example that we shall also need in Chap. 12. 
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f(x) | OS 


_—— 
L «x 


(0) The given function f(x) 


=f; oa x 


(a) f(x) continued as an even periodic function of period 2L 


(b) f(x) continued as an odd periodic function of period 2Z 
Fig. 270. Even and odd extensions of period 2L 


EXAMPLE 6 “Triangle” and Its Half-Range Expansions 


hk Find the two half-range expansions of the function (Fig. 271) 
2k - L 
=x if O<x< 5 


Fig. 271. The given fx) = 
function in Example 6 at -») if a= wei. 


Solution. (a) Even periodic extension. From (6*) we obtain 


1p2k p%? 2k [” k 
do x dx 4 (L — x) dx ‘ 
L 0 L/2 2 
272k (¢? ont 2k [© nT 
ay, = —|— xX COS x dx + (L — x) cos x dx}. 
iL IL L L/2 L 


We consider ay. For the first integral we obtain by integration by parts 


L/2 L/2 L/2 
i ni Lx . nt i L . nr 
X COS x dx sin be sin 
L nt 
0 


Similarly, for the second integral we obtain 


s nt L _ nw 
(L — x) cos x dx (L— x) sin x 
L/2 L L 


iE; L\ | at Lv? nt 
0 L sin 33 \ COS na — cos — }. 
2 2 n° 2 
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We insert these two results into the formula for a. The sine terms cancel and so does a factor L?. This gives 


4k nT 
An 2 cos = cos nv — 1}. 


Thus, 
dg = —16k/(2?7), ag. = —16k/(677?), ay. = —16k/ (10? 27), - -- 


and a, = Oifn # 2,6, 10, 14,---. Hence the first half-range expansion of f(x) is (Fig. 272a) 


k (5 QT 1 67r ) 


3 ae OS xT 


FO) 22 8 6 L 


This Fourier cosine series represents the even periodic extension of the given function f(x), of period 2L. 
(b) Odd periodic extension. Similarly, from (6**) we obtain 


5 by => Sin —. 
(5) na 


Hence the other half-range expansion of f(x) is (Fig. 272b) 


f) 


sin x sin xT sin x 


“(2 7 1 | 30 1 | Sa ~) 
Pop 3 L ez L 


a2 


The series represents the odd periodic extension of f(x), of period 2L. 
Basic applications of these results will be shown in Secs. 12.3 and 12.5. fei] 


-L 0) L x 


(a) Even extension 


(b) Odd extension 


Fig. 272. Periodic extensions of f(x) in Example 6 


PROBLEEM—SET 11.2 


1-7 


EVEN AND ODD FUNCTIONS 8-17| FOURIER SERIES FOR PERIOD p = 2L 


Are the following functions even or odd or neither even nor Is the given function even or odd or neither even nor 


odd? 


1. e”, 


odd? Find its Fourier series. Show details of your 


x? cosnx, x2 tan 7x, sinhx — cosh x work. 


2. sin?x, sin (x), Inx, af? +1), xcotx 


. Sums and products of even functions 8. 


. Sums and products of odd functions 


. Product of an odd times an even function 


3. 
4 
5. Absolute values of odd functions 
6. 
7 


. Find all functions that are both even and odd. i) 


(oe) 
2: 
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10. 


11. 
12. 
13. 


14. 
15. 


16. 
17. 


18. 


19. 


20. 


21. 


_4N- 


fd =x? (-1<x<1), p=2 


fQ) =1—x7/4 (-2<x<2) p=4 
1 
2 
i | 
mil 1 
2 2 
F(x) = cos 7x (-4<x<)b, p=l 
x 
2 
= 1 
a 
2 
f(y) =alx] (-l<x<1l, p=2 
-1 1 
Rectifier. Find the Fourier series of the function 


obtained by passing the voltage u(t) = Vocos 1007 
through a half-wave rectifier that clips the negative 
half-waves. 


Trigonometric Identities. Show that the familiar 
identities cos® x = 3 cos x + 4 cos 3x and sin? x = 3 
sin x — % sin 3x can be interpreted as Fourier series 


expansions. Develop cos* x, 


Numeric Values. Using Prob. 11, show that 1 + 4 + 
Pag.7 dyad 
9 ob Té tes 67. 


CAS PROJECT. Fourier Series of 2L-Periodic 
Functions. (a) Write a program for obtaining partial 
sums of a Fourier series (5). 


22. 
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(b) Apply the program to Probs. 8-11, graphing the first 
few partial sums of each of the four series on common 
axes. Choose the first five or more partial sums until 
they approximate the given function reasonably well. 
Compare and comment. 


Obtain the Fourier series in Prob. 8 from that in 
Prob. 17. 


23-29 


HALF-RANGE EXPANSIONS 


Find (a) the Fourier cosine series, (b) the Fourier sine series. 
Sketch f(x) and its two periodic extensions. Show the 


details. 
23. 3 
i 
4 
24, iL 
| 
2 4 
25. 5 
“| a 
26. x 
5 
! 
1 T 
2 
yy 
2 
l 
1 T 
2 
28. tL 
A 
L 


29. f(x) = sinx (0 <x < 77) 


30. 


Obtain the solution to Prob. 26 from that of 
Prob. 27. 
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11.3. Forced Oscillations 


EXAMPLE 1 


Fourier series have important applications for both ODEs and PDEs. In this section we 
shall focus on ODEs and cover similar applications for PDEs in Chap. 12. All these 
applications will show our indebtedness to Euler’s and Fourier’s ingenious idea of splitting 
up periodic functions into the simplest ones possible. 

From Sec. 2.8 we know that forced oscillations of a body of mass m on a spring of 
modulus k are governed by the ODE 


(1) my" + cy’ + ky = r(f) 


where y = y(t) is the displacement from rest, c the damping constant, k the spring constant 
(spring modulus), and r(f) the external force depending on time ¢. Figure 274 shows the 
model and Fig. 275 its electrical analog, an RLC-circuit governed by 


(1*) LI" + RI! + =i = E' (t) (Sec. 2.9). 


We consider (1). If r(f) is a sine or cosine function and if there is damping (c > 0), 
then the steady-state solution is a harmonic oscillation with frequency equal to that of r(d). 
However, if r(f) is not a pure sine or cosine function but is any other periodic function, 
then the steady-state solution will be a superposition of harmonic oscillations with 
frequencies equal to that of r(f) and integer multiples of these frequencies. And if one of 
these frequencies is close to the (practical) resonant frequency of the vibrating system (see 
Sec. 2.8), then the corresponding oscillation may be the dominant part of the response of 
the system to the external force. This is what the use of Fourier series will show us. Of 
course, this is quite surprising to an observer unfamiliar with Fourier series, which are 
highly important in the study of vibrating systems and resonance. Let us discuss the entire 
situation in terms of a typical example. 


Cc 
Spring 
R L 
External Mesa 
force r(t) a ; 
§) Dashpot 
} E(t) 
Fig. 274. Vibrating system Fig. 275. Electrical analog of the system 
under consideration in Fig. 274 (RLC-circuit) 


Forced Oscillations under a Nonsinusoidal Periodic Driving Force 


In (1), let m = 1 (g), c = 0.05 (g/sec), and k = 25 (g/sec”), so that (1) becomes 


(2) y” + 0.05y’ + 25y = r(t) 
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1/2 
Fig. 276. Force in Example 1 
where r(ft) is measured in g - cm/sec”. Let (Fig. 276) 


7 
ae if -m<r<0O, 


rt) = r(t + 27) = r(t). 


if 0<t<7, 


Find the steady-state solution y(f). 


Solution. We represent r(t) by a Fourier series, finding 


4 1 1 
(3) r(t (cos t cos 3t 4 cos 5t + ~). 
0 7 37 5° 
Then we consider the ODE 
” ' 4 
(4) y +0.05y + 25y cos nt (n= 1,3;+++) 


whose right side is a single term of the series (3). From Sec. 2.8 we know that the steady-state solution y, (1) 
of (4) is of the form 


(5) Yn = Ay cos nt + By sin nt. 


By substituting this into (4) we find that 


4(25 — n? 0.2 
( n°) a 


ao. tea. where Dy, = (25 — ny? + (0.05n). 
n nr 


(6) An 


Since the ODE (2) is linear, we may expect the steady-state solution to be 
7) y=yitys+ ys to 


where y,, is given by (5) and (6). In fact, this follows readily by substituting (7) into (2) and using the Fourier 
series of r(t), provided that termwise differentiation of (7) is permissible. (Readers already familiar with the 
notion of uniform convergence [Sec. 15.5] may prove that (7) may be differentiated term by term.) 

From (6) we find that the amplitude of (5) is (a factor VD,, cancels out) 


4 
24 2 
NO aD, 


Values of the first few amplitudes are 
Cy, = 0.0531 C3 = 0.0088 C5 = 0.2037 C7 = 0.0011 Cy = 0.0003. 


Figure 277 shows the input (multiplied by 0.1) and the output. For n = 5 the quantity D, is very small, the 
denominator of Cs is small, and Cs is so large that y5 is the dominating term in (7). Hence the output is almost 
a harmonic oscillation of five times the frequency of the driving force, a little distorted due to the term y;, whose 
amplitude is about 25% of that of ys. You could make the situation still more extreme by decreasing the damping 
constant c. Try it. 
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Output 


Fig. 277. 


1. Coefficients C,,. Derive the formula for C, from A, 
and By. 

2. Change of spring and damping. In Example 1, what 
happens to the amplitudes C,, if we take a stiffer spring, 
say, of k = 49? If we increase the damping? 

3. Phase shift. Explain the role of the B,,’s. What happens 
if we let c—0? 

4. Differentiation of input. In Example 1, what happens 
if we replace r(f) with its derivative, the rectangular wave? 
What is the ratio of the new C), to the old ones? 

5. Sign of coefficients. Some of the A,, in Example | are 
positive, some negative. All B, are positive. Is this 
physically understandable? 


6-11) GENERAL SOLUTION 


Find a general solution of the ODE y” + wy = r(¢) with 
r(t) as given. Show the details of your work. 
6. r(t) = sin at + sin Bt, wo # a”, B? 
7. r(t) = sint,w = 0.5, 0.9, 1.1, 1.5, 10 
8. Rectifier. r(t) = 77/4 |cos ¢| if -7 <t< 7 and 
r(t + 277) = r(t), |o| # 0,2, 4,--- 
9. What kind of solution is excluded in Prob. 8 by 
|w| # 0,2, 4,---2 
10. Rectifier. r(f) = 77/4 |sin ¢| if 0 < t < 27 and 
r(t + 27) = r(t), lol # 0,2, 4,--- 


-l if -7w<t<0 
11. r(f) = 
1 if 
12. CAS Program. Write a program for solving the ODE 


just considered and for jointly graphing input and output 
of an initial value problem involving that ODE. Apply 


lw| #1,3,5,-°° 
0<t<7, 


Input and steady-state output in Example 1 


PROBLEM SET 11-3 


the program to Probs. 7 and 11 with initial values of your 
choice. 


13-16 | STEADY-STATE DAMPED OSCILLATIONS 


Find the steady-state oscillations of y” + cy’ + y = r(t) 
with c > 0 and r(f) as given. Note that the spring constant 
is k = 1. Show the details. In Probs. 14-16 sketch r(f). 


N 
13. r() = Yan cos nt + by, sin nt) 
n=1 
—-1 if -7<t<0 
14. r(t) = and r(t + 277) = r(t) 
1 if O<t<7 


15. r(t) = t(7? — 1?) if -m<t<7 and 
r(t + 2m) = r() 
16. r(f) = 
tif -—m/2<t<7/2 
{ and r(t + 277) = r(t) 


am-tif w/2<t<3m/2 


17-19} RLC-CIRCUIT 

Find the steady-state current /(f) in the RLC-circuit in 
Fig. 275, where R = 100, L = 1H, C= 10~1! Fand with 
E(t) V as follows and periodic with period 277. Graph or 
sketch the first four partial sums. Note that the coefficients 


of the solution decrease rapidly. Hint. Remember that the 
ODE contains E'(r), not E(t), cf. Sec. 2.9. 


—50r7 if 
507 if 


-7<t<0 


£0 ={ 0<t<7 
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100(@¢¢— 17) if -7<1r<0 20. CAS EXPERIMENT. Maximum Output Term. 
Graph and discuss outputs of y" + cy’ + ky = r(t) with 
r(t) as in Example 1 for various c and k with emphasis on 
19. E(t) = 2008 (ar? a t?) (-7 <t< 7) the maximum C),, and its ratio to the second largest ICnl. 


18. £0 -{ 
@ 100(¢ + 47) if 0<t<7 


11.4 Approximation 
by Trigonometric Polynomials 


Fourier series play a prominent role not only in differential equations but also in 
approximation theory, an area that is concerned with approximating functions by 
other functions—usually simpler functions. Here is how Fourier series come into the 
picture. 

Let f(x) be a function on the interval —77 = x S 7 that can be represented on this 
interval by a Fourier series. Then the Nth partial sum of the Fourier series 


N 
(1) f(x) ~ ao + (Gn cos nx + by sin nx) 


n=1 


is an approximation of the given f(x). In (1) we choose an arbitrary N and keep it fixed. 
Then we ask whether (1) is the “best” approximation of f by a trigonometric polynomial 
of the same degree N, that is, by a function of the form 


N 
(2) F(x) = Ap + ey (A, cos nx + B,, sin nx) (N fixed). 


n=1 


Here, “best” means that the “error” of the approximation is as small as possible. 

Of course we must first define what we mean by the error of such an approximation. 
We could choose the maximum of | f(x) — F(x)|. But in connection with Fourier series 
it is better to choose a definition of error that measures the goodness of agreement between 
fand F on the whole interval —7 = x & 77. This is preferable since the sum f of a Fourier 
series may have jumps: F in Fig. 278 is a good overall approximation of f, but the maximum 
of | f(x) — F(x)| (more precisely, the supremum) is large. We choose 


(3) (ee | (f — F) dx. 


=i 


Fig. 278. Error of approximation 
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This is called the square error of F relative to the function fon the interval —7 S x S 77. 
Clearly, E = 0. 

N being fixed, we want to determine the coefficients in (2) such that E is minimum. 
Since (f — F)? = f? — 2fF + F?, we have 


(4) £= | frdx - 2| fF dx + | F? dx. 
We square (2), insert it into the last integral in (4), and evaluate the occurring integrals. 


This gives integrals of cos*nx and sin? nx(n = 1), which equal 7r, and integrals of 
cos nx, sin nx, and (cos nx)(sin mx), which are zero (just as in Sec. 11.1). Thus 


7 7 N 2 
| F? dx | Ao + > (A, cos nx + By sin ms) dx 


_ -—7 n=1 


m(2Ag + AZ +--+ + AN + BE +--+ + BR). 


We now insert (2) into the integral of fF in (4). This gives integrals of fcos nx as well 
as f sin nx, just as in Euler’s formulas, Sec. 11.1, for a, and b, (each multiplied by A,, or 
B,,). Hence 


| fF dx = 77 (2Aodo + Ady apse Anan + Byb, ee Byby). 


-7T 


With these expressions, (4) becomes 


7 N 
E= | fax - 2n| 2bod + > Ain Bybr)| 
( 5) -7 n=1 
N 
+ | 248 +S (AR + BD) 

n=1 
We now take A, = da, and B, = by, in (2). Then in (5) the second line cancels half of the 
integral-free expression in the first line. Hence for this choice of the coefficients of F the 
square error, call it E*, is 


7 N 
(6) jh | fPde- | 2a +S @rt v2) 


7 n=1 


We finally subtract (6) from (5). Then the integrals drop out and we get terms 
A2 — 2A,a, + a2 = (A, — an)” and similar terms (B,, — b,)*: 


N 
E- E*= ar{21Ag =@y + SD An = ay + C= writ. 


n=1 
Since the sum of squares of real numbers on the right cannot be negative, 
E= EF = 0, thus Ez E*, 


and EF = E* if and only if Ag = do,:-:, By = by. This proves the following fundamental 
minimum property of the partial sums of Fourier series. 
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THEOREM 1 Minimum Square Error 


The square error of F in (2) (with fixed N) relative to f on the interval —7T Sx S 7 
is minimum if and only if the coefficients of F in (2) are the Fourier coefficients of f. 
This minimum value E* is given by (6). 


From (6) we see that E* cannot increase as N increases, but may decrease. Hence with 
increasing N the partial sums of the Fourier series of f yield better and better approxi- 
mations to f, considered from the viewpoint of the square error. 

Since E* = 0 and (6) holds for every N, we obtain from (6) the important Bessel’s 
inequality 


7 


(7) 2af + DS a2 + bys = | F(x)? dx 


n=1 - 


for the Fourier coefficients of any function f for which integral on the right exists. (For 
F. W. Bessel see Sec. 5.5.) 


It can be shown (see [C12] in App. 1) that for such a function f, Parseval’s theorem holds; 
that is, formula (7) holds with the equality sign, so that it becomes Parseval’s identity? 


(8) 2a§ + > (az + bY) = = | F(x)? dex. 


n=1 <7 


EXAMPLE 1. Minimum Square Error for the Sawtooth Wave 


Compute the minimum square error E* of F(x) with N = 1, 2,---, 10, 20,---, 100 and 1000 relative to 
f(y) =xt+7 (-7 <x<7) 


on the interval -7 Sx ST. 


io fs Se 
Solution. F(x) = 7 + 2 (sinx i sin 2x + 3 sin 3x fesse of ¥ sin Nx) by Example 3 in 
Sec. 11.3. From this and (6), 7 
7 N 1 
E* (x 4 a)? n= 7 (27? t4 > ) 
-—7 n=1" 
Numeric values are: 
2n 
N Hs N E* N E* N E* 
y? 1 8.1045 6 1.9295 20 0.6129 70 0.1782 
2 4.9629 7 1.6730 30 0.4120 80 0.1561 
- j 1 3 3.5666 8 1.4767 40 0.3103 90 0.1389 
; ; 4 2.7812 9 1.3216 50 0.2488 100 0.1250 
uc hrsaanuaased 5 2.2786 10 1.1959 60 0.2077 1000 0.0126 
N = 20 in Example 1 


3MARC ANTOINE PARSEVAL (1755-1836), French mathematician. A physical interpretation of the identity 
follows in the next section. 
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1 


F = Sj, So, Sz are shown in Fig. 269 in Sec. 11.2, and F = Syo is shown in Fig. 279. Although | f(x) — F(x)| 
is large at +77 (how large?), where fis discontinuous, F approximates f quite well on the whole interval, except 


near +77, where “waves” remain owing to the “Gibbs phenomenon,” which we shall discuss in the next section. 
Can you think of functions f for which E* decreases more quickly with increasing N? 3] 


PROBLEM SET 11.4 


. CAS Problem. Do the numeric and graphic work in 
Example 1 in the text. 


2-5| MINIMUM SQUARE ERROR 


Find the trigonometric polynomial F (x) of the form (2) for 
which the square error with respect to the given f(x) on the 
interval —77 < x < 7 is minimum. Compute the minimum 


value for N = 1, 2,--- , 5 (or also for larger values if you 
have a CAS). 
2. f(y =x (-7T<x< 7) 


3 
4 


un 


10. 


. f(x) = |x| (<7 <x< 7) 
f(x) = x2 (-7 <x< 7) 


-1 if -7w<x<0 
f= 
1 if O0<x<T7 
. Why are the square errors in Prob. 5 substantially larger 
than in Prob. 3? 
f@=x2? (-T<x<7) 
. f(x) = |sinx| (-a <x < 7), full-wave rectifier 


. Monotonicity. Show that the minimum square error 
(6) is a monotone decreasing function of NV. How can 
you use this in practice? 

CAS EXPERIMENT. Size and Decrease of E*. 
Compare the size of the minimum square error E* for 
functions of your choice. Find experimentally the 


factors on which the decrease of E* with N depends. 
For each function considered find the smallest N such 
that E* < 0.1. 


11-15) PARSEVALS’S IDENTITY 


Using (8), prove that the series has the indicated sum. 
Compute the first few partial sums to see that the convergence 
is rapid. 


1 1 2 
IL 1+ 5+ 54+ = = 1.233700550 
3 5 8 
Use Example 1 in Sec. 11.1. 
1 1 if 
12.1+—4+—+4.--=2 = 1,082323234 
ge Be 90 
Use Prob. 14 in Sec. 11.1. 
4 
Gite) ot ea 4 ieee 
a a i 96 
Use Prob. 17 in Sec. 11.1. 
ae 30 
14. cos" x dx = —— 
-—7 4 
ee 51 
15. cos’ x dx = 3 


= 


11.5 Sturm-Liouville Problems. 
Orthogonal Functions 


The idea of the Fourier series was to represent general periodic functions in terms of 
cosines and sines. The latter formed a trigonometric system. This trigonometric system 
has the desirable property of orthogonality which allows us to compute the coefficient of 
the Fourier series by the Euler formulas. 

The question then arises, can this approach be generalized? That is, can we replace the 
trigonometric system of Sec. 11.1 by other orthogonal systems (sets of other orthogonal 
functions)? The answer is “yes” and will lead to generalized Fourier series, including the 
Fourier—Legendre series and the Fourier—Bessel series in Sec. 11.6. 

To prepare for this generalization, we first have to introduce the concept of a Sturm— 
Liouville Problem. (The motivation for this approach will become clear as you read on.) 
Consider a second-order ODE of the form 
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EXAMPLE 1 


(1) [p@)y'l’ + [q(@%) + Ar@ly = 0 


on some interval a = x S J, satisfying conditions of the form 


(a) kiyt+key’ =0 atx=a 


2 
(2) b. 


(b) [yy +t+ley'’ =0 atx 


Here A is a parameter, and ky, ko, 11, /2 are given real constants. Furthermore, at least one 
of each constant in each condition (2) must be different from zero. (We will see in Example 
1 that, ifp(x) = r(x) = 1 and q(x) = 0, then sin V Ax and cos VAx satisfy (1) and constants 
can be found to satisfy (2).) Equation (1) is known as a Sturm-Liouville equation.* 
Together with conditions 2(a), 2(b) it is know as the Sturm-—Liouville problem. It is an 
example of a boundary value problem. 

A boundary value problem consists of an ODE and given boundary conditions 
referring to the two boundary points (endpoints) x = a and x = b of a given interval 
asxBb. 


The goal is to solve these type of problems. To do so, we have to consider 


Eigenvalues, Eigenfunctions 


Clearly, y = 0 is a solution—the “trivial solution” —of the problem (1), (2) for any A 
because (1) is homogeneous and (2) has zeros on the right. This is of no interest. We want 
to find eigenfunctions y (x), that is, solutions of (1) satisfying (2) without being identically 
zero. We call a number A for which an eigenfunction exists an eigenvalue of the Sturm— 
Liouville problem (1), (2). 

Many important ODEs in engineering can be written as Sturm—Liouville equations. The 
following example serves as a case in point. 


Trigonometric Functions as Eigenfunctions. Vibrating String 
Find the eigenvalues and eigenfunctions of the Sturm—Liouville problem 
(3) y" +dAy=0, (0) = 0, y(ar) = 0. 


This problem arises, for instance, if an elastic string (a violin string, for example) is stretched a little and fixed 
at its ends x = 0 and x = 7 and then allowed to vibrate. Then y(x) is the “space function” of the deflection 
u(x, t) of the string, assumed in the form u(x, t) = y(x)w(t), where ¢ is time. (This model will be discussed in 
great detail in Secs, 12.2-12.4.) 


Solution. From (1) nad (2) we see that p=1,q=0,r=1 in (1), and a=0,b=7,k, =], =1, 
ky = lg = 0 in (2). For negative A = -va general solution of the ODE in (3) is y(x) = cye”” + coe”. From 
the boundary conditions we obtain cy; = cg = 0, so that y = 0, which is not an eigenfunction. For A = 0 the 
situation is similar. For positive A = v7 a general solution is 


y(x) = Acos vx + B sin vx. 


‘JACQUES CHARLES FRANCOIS STURM (1803-1855) was born and studied in Switzerland and then 
moved to Paris, where he later became the successor of Poisson in the chair of mechanics at the Sorbonne (the 
University of Paris). 

JOSEPH LIOUVILLE (1809-1882), French mathematician and professor in Paris, contributed to various 
fields in mathematics and is particularly known by his important work in complex analysis (Liouville’s theorem; 
Sec. 14.4), special functions, differential geometry, and number theory. 
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From the first boundary condition we obtain y(0) = A = 0. The second boundary condition then yields 
y(7r) = Bsin v7 = 0, thus v=0,+1,+2,--- 


For v = 0 we have y = 0. For A = v= 1,4, 9, 16, ---, taking B = 1, we obtain 


y(x) = sin vx w= VA =1,2,-:°). 
Hence the eigenvalues of the problem are A = v, where v = 1, 2,---, and corresponding eigenfunctions are 
y(x) = sin vx, wherev = 1,2---. el 


Note that the solution to this problem is precisely the trigonometric system of the Fourier 
series considered earlier. It can be shown that, under rather general conditions on the 
functions p, g, rin (1), the Sturm—Liouville problem (1), (2) has infinitely many eigenvalues. 
The corresponding rather complicated theory can be found in Ref. [All] listed in App. 1. 

Furthermore, if p, g, r, and p’ in (1) are real-valued and continuous on the interval 
a =x = band ris positive throughout that interval (or negative throughout that interval), 
then all the eigenvalues of the Sturm—Liouville problem (1), (2) are real. (Proof in App. 4.) 
This is what the engineer would expect since eigenvalues are often related to frequencies, 
energies, or other physical quantities that must be real. 

The most remarkable and important property of eigenfunctions of Sturm—Liouville 
problems is their orthogonality, which will be crucial in series developments in terms of 
eigenfunctions, as we shall see in the next section. This suggests that we should next 
consider orthogonal functions. 


Orthogonal Functions 


Functions y3(x), yo (x), +: defined on some interval a S x S bare called orthogonal on this 
interval with respect to the weight function r(x) > 0 if for all m and all n different from m, 


b 
(4) Om Yn) = | (x) Vm @) Yn (x) dx = 0 (m # n). 


a 


(Ym; Yn) is a standard notation for this integral. The norm ||y,»|| of ym is defined by 


(5) Ilymll = V Om Ym) = 


b 
| r(x)y2, (x) dx. 


Note that this is the square root of the integral in (4) with n = m. 

The functions yj, yo,-:- are called orthonormal on a = x S 5 if they are orthogonal 
on this interval and all have norm |. Then we can write (4), (5) jointly by using the 
Kronecker symbol? 6,,,,,, namely, 


b 0 if m#n 
Ons Yn)! = | ACM pCO) Gk? = Oa, = { 


a 


1 if m=n. 


®°LEOPOLD KRONECKER (1823-1891). German mathematician at Berlin University, who made important 
contributions to algebra, group theory, and number theory. 
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EXAMPLE 2 


THEOREM 1 


If r(x) = 1, we more briefly call the functions orthogonal instead of orthogonal with 
respect to r(x) = 1; similarly for orthognormality. Then 


b b 
(Ym: Yn) = | Ym@yn@dx=0 (m#n), — |bymll = VOmyn) =, | | yon(x) dx. 


The next example serves as an illustration of the material on orthogonal functions just 
discussed. 


Orthogonal Functions. Orthonormal Functions. Notation 


The functions y,,(x) = sin mx, m = 1, 2,--- form an orthogonal set on the interval —77 S x S 77, because for 
m # n we obtain by integration [see (11) in App. A3.1] 
a 7 


1 1 
(Ym Yn) = | sin mx sin nx dx = Al cos (m — n)x dx — Al cos(m+n)xdx =0, (m #n). 
7 


=—T bow’ 
The norm || ym] = VGrm, Ym) equals V7 because 


7 
lly I? = (Ym Ym) = | sin” mx dx = 7 (m = 1,2,°--) 


-7 


Hence the corresponding orthonormal set, obtained by division by the norm, is 


sin x sin 2x sin 3x Py 


Vir Vi Vi 


Theorem | shows that for any Sturm—Liouville problem, the eigenfunctions associated with 
these problems are orthogonal. This means, in practice, if we can formulate a problem as a 
Sturm-Liouville problem, then by this theorem we are guaranteed orthogonality. 


Orthogonality of Eigenfunctions of Sturm—Liouville Problems 


Suppose that the functions p, q, r, and p’ in the Sturm—Liouville equation (1) are 
real-valued and continuous and r(x) > 0 on the interval a S x S&S b. Let yy, (x) and 
Yn (x) be eigenfunctions of the Sturm—Liouville problem (1), (2) that correspond to 
different eigenvalues A, and A,,, respectively. Then Ym, Yy are orthogonal on that 
interval with respect to the weight function r, that is, 


b 
(6) (Ym Yn) = | r(X)¥m @)Yn (x) dx = 0 (m # n). 


a 


If p(a) = 0, then (2a) can be dropped from the problem. If p(b) = 0, then (2b) 
can be dropped. [It is then required that y and y’ remain bounded at such a point, 
and the problem is called singular, as opposed to a regular problem in which (2) 
is used. ] 

If p(a) = p(b), then (2) can be replaced by the “periodic boundary conditions” 


(7) y(a) = y(b), —-y'(a) = y'(b). 


The boundary value problem consisting of the Sturm—Liouville equation (1) and the periodic 
boundary conditions (7) is called a periodic Sturm-—Liouville problem. 
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By assumption, y,, and y,, satisfy the Sturm—Liouville equations 


(Pym) + (+ Amr Ym = 0 
(Pyn) + (q+ Anryn = 0 


respectively. We multiply the first equation by y,, the second by —y,,, and add, 


Am — An)mYn = Ym(P¥n)’ — Yn(P¥m)’ = ((P¥n)¥m — [(P¥m)ynl’ 


where the last equality can be readily verified by performing the indicated differentiation 
of the last expression in brackets. This expression is continuous on a = x S b since p and 
p’ are continuous by assumption and y,,, Yn are solutions of (1). Integrating over x from 
a to b, we thus obtain 


b 
(8) Am — | r¥mYn Ax = [PLYnYm — YmIadle (a < b). 


The expression on the right equals the sum of the subsequent Lines | and 2, 


(9) P(b)Lyn(D)Ym(b) — ym(b) ynlb)] (Line 1) 
—pP@lyn@ym(@) — Ym(@yn(a)] (Line 2). 


Hence if (9) is zero, (8) with A, — A, # 0 implies the orthogonality (6). Accordingly, 
we have to show that (9) is zero, using the boundary conditions (2) as needed. 


Case 1. p(a) = p(b) = 0. Clearly, (9) is zero, and (2) is not needed. 

Case 2. p(a) # 0,p(b) = 0. Line | of (9) is zero. Consider Line 2. From (2a) we have 
kiyn(a) + keyn(a) = 0, 
kiym(a) + keym(a) = 0. 


Let kg # 0. We multiply the first equation by y,, (a), the last by —y,,(a) and add, 


kelyn(@)Ym(a) — Ym(a)yn(a)] = 0. 


This is ky times Line 2 of (9), which thus is zero since kg # 0. If ko = 0, then ky # 0 
by assumption, and the argument of proof is similar. 


Case 3. p(a) = 0, p(b) # 0. Line 2 of (9) is zero. From (2b) it follows that Line 1 of (9) 
is zero; this is similar to Case 2. 

Case 4. p(a) # 0, p(b) # 0. We use both (2a) and (2b) and proceed as in Cases 2 and 3. 
Case 5. p(a) = p(b). Then (9) becomes 


PL ynO)ym(b) — ym(bynlb) — Yn @ym@ + Yn(MynO)- 


The expression in brackets [--- ] is zero, either by (2) used as before, or more directly by 
(7). Hence in this case, (7) can be used instead of (2), as claimed. This completes the 
proof of Theorem 1. a 


Application of Theorem 1. Vibrating String 


The ODE in Example | is a Sturm—Liouville equation with p = 1, g = 0, andr = 1. From Theorem | it follows 
that the eigenfunctions y,, = sin mx (m = 1, 2,---) are orthogonal on the interval 0 = x S 7. 
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EXAMPLE 4 
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Example 3 confirms, from this new perspective, that the trigonometric system underlying 
the Fourier series is orthogonal, as we knew from Sec. 11.1. 


Application of Theorem 1. Orthogonlity of the Legendre Polynomials 


2xy’ 4 


Legendre’s equation (1 x?) y" n(n + 1)y = 0 may be written 


[dd — x)y’]! + Ay =0 A=n(n+ 1. 


Hence, this is a Sturm-—Liouville equation (1) with p = 1 — x*, q = 0, and r = 1. Since p(—1) = p(1) = 0, we 
need no boundary conditions, but have a “singular” Sturm-Liouville problem on the interval —1 = x = 1. We 
know that for n = 0, 1,---, hence A = 0,1 - 2,2 -3,---, the Legendre polynomials P, (x) are solutions of the 
problem. Hence these are the eigenfunctions. From Theorem | it follows that they are orthogonal on that interval, 
that is, 


1 
(10) | Pry (x)Py (x) dx = 0 (m#n). 


-1 


What we have seen is that the trigonometric system, underlying the Fourier series, is 
a solution to a Sturm—Liouville problem, as shown in Example 1, and that this 
trigonometric system is orthogonal, which we knew from Sec. 11.1 and confirmed in 


Example 3. 


PROBLEEM—SET 11-5 


1, 


2-6 
2. 


6. 


Proof of Theorem 1. Carry out the details in Cases 3 
and 4. 


ORTHOGONALITY 


Normalization of eigenfunctions y,,, of (1), (2) means 
that we multiply y,, by a nonzero constant c,, such that 
CmYm has norm 1. Show that Zz, = cym with anyc # 0 
is an eigenfunction for the eigenvalue corresponding 
tO Ym- 


. Change of x. Using Prob. 3, derive the orthogonality 


P,(cos 0), n = 0, 1,---, from an orthogonal set on the 
interval 0 = 6 S 7 with respect to the weight function 
sin 0. 

Tranformation to Sturm-Liouville form. Show that 
y”" + fy’ + (g + Ah) y = 0 takes the form (1) if you 


set p = exp(ffdx), g = pg, r = hp. Why would you 
do such a transformation? 


7-15 


STURM-LIOUVILLE PROBLEMS 


Find the eigenvalues and eigenfunctions. Verify orthogo- 
nality. Start by writing the ODE in the form (1), using 
Prob. 6. Show details of your work. 


, 7. y" + d4y=0, y)=0, y(10)=0 
. Change of x. Show that if the functions yg (x), 1 (x), ° °° . 
form an orthogonal set on an interval a S x S b (with 8 y +Ay=0, yO) =0, yL)=0 
r(x) = 1), then the functions yo(ct + k), y1(ct + hk), 9 y" +drAy=0, y0)=0, y(L)=0 
--+,c >0, form an orthogonal set on the interval 10. y" +Ay =0, y(0)=y(1), y(0) = yA) 
(a-—kh/oStsb—hk/c. Whar 7 7 
11. (y /x) + (A + Dy/x” = 0, yy) =0, yle”) = 0. 


: : (Set x = e°.) 
of 1, cos 7x, sin 7x, cos 27x, sin 277rx, on ‘ : 
—1 =x 21 (r@ = 1) from that of 1, cos x, sin x, 12. y 2y +At ly=0, yO)=0, yd)=0 
cos 2x, sin 2x, ++: on -7TWSxS7. 13. y” + 8y’ +(A + 16)y =0, y(0)=0, y(m) =0 
. Legendre polynomials. Show that the functions 44, TEAM PROJECT. Special Functions. Orthogonal 


polynomials play a great role in applications. For 
this reason, Legendre polynomials and various other 
orthogonal polynomials have been studied extensively; 
see Refs. [GenRef1], [GenRef10] in App. 1. Consider 
some of the most important ones as follows. 


504 CHAP. 11_ Fourier Analysis 


(a) Chebyshev polynomials® of the first and second that 7,,(x), n = 0,1,2,3, satisfy the Chebyshev 
kind are defined by equation 


m4 


el x2)y" xy + n7y = 0. 


T, (x) = cos (n arccos x) 


sin [(n + 1) arccos x] (b) Orthogonality on an infinite interval: Laguerre 
Un (x) = ; 5 polynomials’ are defined by Lo = 1, and 
a, 4 
ed" (x"e~*) 
respectively, where n = 0, 1,---. Show that LyX) = a a n=1,2,:-- 
Ty = 1, Tx) = x, T(x) = 2x7 — 1. Show that 
B(x) = 4x3 — 3x, 


L(x) = 1-4, L(x) = 1 — 2x + x?/2, 


Up = 1, U(x) = 2x, U(x) = 4x7 - 1, ‘ 
L3(x) = 1 — 3x + 3x*/2 — x°/6. 


Us(x) = 8x3 — 4x. 


Prove that the Laguerre polynomials are orthogonal on 


Show that the Chebyshev polynomials 7,,(x) are the positive axis 0 S x < © with respect to the weight 
orthogonal on the interval —1 = x = 1 with respect function r(x) = e~*. Hint. Since the highest power in 
to the weight function r(x) = 1/V1 — x?. (Hint. Lm is x™, it suffices to show that fe~*x*L,, dx = 0 

To evaluate the integral, set arccos x = 6.) Verify for k <n. Do this by k integrations by parts. 


11.6 Orthogonal Series. 
Generalized Fourier Series 


Fourier series are made up of the trigonometric system (Sec. 11.1), which is orthogonal, 
and orthogonality was essential in obtaining the Euler formulas for the Fourier coefficients. 
Orthogonality will also give us coefficient formulas for the desired generalized Fourier 
series, including the Fourier-Legendre series and the Fourier—Bessel series. This gener- 
alization is as follows. 

Let yo, y1, y2,°** be orthogonal with respect to a weight function r(x) on an interval 
a =x Sb, and let f(x) be a function that can be represented by a convergent series 


(1) fo) = >) amym@ = aoyo(x) + airyi@@) + + 


m=0 


This is called an orthogonal series, orthogonal expansion, or generalized Fourier series. 
If the y,,, are the eigenfunctions of a Sturm—Liouville problem, we call (1) an eigenfunction 
expansion. In (1) we use again m for summation since n will be used as a fixed order of 
Bessel functions. 

Given f(x), we have to determine the coefficients in (1), called the Fourier constants 
of f(x) with respect to yo, y1,°**. Because of the orthogonality, this is simple. Similarly 
to Sec. 11.1, we multiply both sides of (1) by r(x)y, (x) (n fixed) and then integrate on 


SPAFNUTI CHEBYSHEV (1821-1894), Russian mathematician, is known for his work in approximation 
theory and the theory of numbers. Another transliteration of the name is TCHEBICHEF. 

7EDMOND LAGUERRE (1834-1886), French mathematician, who did research work in geometry and in 
the theory of infinite series. 
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both sides from a to b. We assume that term-by-term integration is permissible. (This is 
justified, for instance, in the case of “uniform convergence,” as is shown in Sec. 15.5.) 
Then we obtain 


b 


b b = 20 
(f.¥n) = | rf¥n dx = | r( = and ) nd => am| rym Yn de = > am Om Yn)- 
m=0 


a a m=0 a m=0 


Because of the orthogonality all the integrals on the right are zero, except when m = n. 
Hence the whole infinite series reduces to the single term 


Gnl¥ns Yn) = Gala? Thos (hyn) = aullyall?. 


Assuming that all the functions y,, have nonzero norm, we can divide by || y,J|?; writing again 
m for n, to be in agreement with (1), we get the desired formula for the Fourier constants 


Cf, Ym) il 
ae Iya? = Iv? | r(x) f(X)ym (x) dx (n = 0, 1,-"°). 
Ym Ym a 


(2) am 


This formula generalizes the Euler formulas (6) in Sec. 11.1 as well as the principle of 
their derivation, namely, by orthogonality. 


EXAMPLE 1_ Fourier—Legendre Series 


A Fourier—Legendre series is an eigenfunction expansion 


Nik 
VY 


f@) = > amPn) doPo + ayPy (x) + doPo(x) + +++ = ag + ayx 4 ay (3 x” 
m=0 


in terms of Legendre polynomials (Sec. 5.3). The latter are the eigenfunctions of the Sturm—Liouville problem 
in Example 4 of Sec. 11.5 on the interval —1 = x S 1. We have r(x) = 1 for Legendre’s equation, and (2) 
gives 


2m+ 1 


1 
| F@)Pm(@) dx, m=0,1,-°: 
=il 


(3) adm = 


because the norm is 


(4) Pall = ore 2 ee es 
ie ae m+ 1 or 


as we state without proof. The proof of (4) is tricky; it uses Rodrigues’s formula in Problem Set 5.2 and a 
reduction of the resulting integral to a quotient of gamma functions. 
For instance, let f(x) = sin 77x. Then we obtain the coefficients 


1 1 
3 3 
| (sin 77x)Pyy (x) dx, thus ay = . | x sin 7x dx = = 0.95493, etc. 
-1 =1 


2m + 1 


an = 
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Hence the Fourier-Legendre series of sin 77x is 


sin 7x = 0.95493P, (x) — 1.15824P3 (x) + 0.21929P5 (x) — 0.01664P) (x) + 0.000687 (x) 
— 0.00002P,4 (x) +++. 


The coefficient of P,3 is about 3 - 107’. The sum of the first three nonzero terms gives a curve that practically 
coincides with the sine curve. Can you see why the even-numbered coefficients are zero? Why ag is the absolutely 
biggest coefficient? fs] 


Fourier—Bessel Series 


These series model vibrating membranes (Sec. 12.9) and other physical systems of circular symmetry. We derive 
these series in three steps. 


Step 1. Bessel’s equation as a Sturm-—Liouville equation. The Bessel function J, (x) with fixed integer n 2 0 
satisfies Bessel’s equation (Sec. 5.5) 


F27 (©) + Fy, (©) + (F? — n)J,(¥) = 0 


where J, = dJ,,/dx and Ia = a? Jp/ dx”. We set x = kx. Then x = x/k and by the chain rule, Jn = dJy,/dx = 
(dJn/adxy/k and Jn, = Jn/ k?. In the first two terms of Bessel’s equation, k? and k drop out and we obtain 


x2," (kx) + xd, (ke) + (k2x? — nn) (kx) = 0. 


Dividing by x and using (xJ;,(kx))’ = xJ?, (kx) + J, (kx) gives the Sturm-Liouville equation 


2 
(5) Lie)’ + (-= ts ax) Lt) = 0 =k 


with p(x) = x, q(x) = —n?/x, r(x) =x, and parameter A = k?. Since p(0) = 0, Theorem 1 in Sec. 11.5 
implies orthogonality on an interval 0 = x = R (R given, fixed) of those solutions J,(kx) that are zero at 
x = R, that is, 


(6) J(kR) = 0 (n fixed). 


Note that g(x) = —n?/x is discontinuous at 0, but this does not affect the proof of Theorem 1. 
Step 2. Orthogonality. It can be shown (see Ref. [A13]) that J,(x) has infinitely many zeros, say, 
X = An < dng < -*- (see Fig. 110 in Sec. 5.4 for n = 0 and 1). Hence we must have 


(7) kKR = Quam thus knm = ,m/R (m = 1,2,-+-). 


This proves the following orthogonality property. 


Orthogonality of Bessel Functions 


For each fixed nonnegative integer n the sequence of Bessel functions of the first 
kind Jn(kn1X), In(kn 2X), +** with ky m as in (7) forms an orthogonal set on the 
interval 0 S x S R with respect to the weight function r(x) = x, that is, 


R 
(8) | Sy (kn mW nlkn, 7x) dx = 0 (j # m, n fixed). 
0 


Hence we have obtained infinitely many orthogonal sets of Bessel functions, one for each of Jo, Jy, Ja, ++ * 
Each set is orthogonal on an interval 0 S x S R with a fixed positive R of our choice and with respect to 
the weight x. The orthogonal set for Jp is Jn(kn 1X), Inlkn,2X), InlkngXx),***, where n is fixed and ky m is 
given by (7). 
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EXAMPLE 3 


Step 3. Fourier—Bessel series. The Fourier—Bessel series corresponding to J, (n fixed) is 


(9) F(x) = > Amd nlknym*) = ay ylk yx) te dad nk n,2X) a 3d nk n,3%) aoe (n fixed). 


m=1 


The coefficients are (with Qm = kn mR) 


) R 
(10) C= ts nS Xf (Xx) Inl(kinmx) dx, m = 1,2,::: 
R Jn+1(Qn,m) 0 


because the square of the norm is 

R 2 
2 2 R 2 
ad 1) IJK nm) | = Jn (KnymX) dx = 2 JinsikknmR) 
0 


as we state without proof (which is tricky; see the discussion beginning on p. 576 of [A13]). B 


Special Fourier—Bessel Series 


For instance, let us consider f(x) = 1 — x® and take R = 1 and n = 0 in the series (9), simply writing A for 
Qo,m- Then kym = G0,m = A = 2.405, 5.520, 8.654, 11.792, etc. (use a CAS or Table Al in App. 5). Next we 
calculate the coefficients a,, by (10) 


an ig . 
am = x(1 — x*)\Jo(Ax) dx. 
ITA) vo 


This can be integrated by a CAS or by formulas as follows. First use [xJ,(Ax)]/ = AxJo(Ax) from Theorem 1| in 
Sec. 5.4 and then integration by parts, 


1 


1 
1 
= >| XJ (Ax)(—2x) dx |. 
o (A 0 


1 
| x1 — x?)Jo(Ax) dx = 0 = x?)xJy(Ax) 
0 


i= 
Jz) JZ(A) 


The integral-free part is zero. The remaining integral can be evaluated by [x7Jo(Ax)]/ = Ax7J4(Ax) from Theorem 1 
in Sec. 5.4. This gives 


4Jo(A) 
am = 


naar ee (A = Q0,m)- 
772 (A) 


Numeric values can be obtained from a CAS (or from the table on p. 409 of Ref. [GenRef1] in App. 1, together 
with the formula J) = 2x ‘J, — Jo in Theorem 1 of Sec. 5.4). This gives the eigenfunction expansion of | — x? 
in terms of Bessel functions Jo, that is, 


1 — x? = 1.1081 Jo(2.405x) — 0.1398J(5.520x) + 0.0455J9(8.654x) — 0.0210J9(11.792x) + ++. 


A graph would show that the curve of 1 — x? and that of the sum of first three terms practically coincide. 


Mean Square Convergence. Completeness 


Ideas on approximation in the last section generalize from Fourier series to orthogonal series 
(1) that are made up of an orthonormal set that is “complete,” that is, consists of “sufficiently 
many” functions so that (1) can represent large classes of other functions (definition below). 
In this connection, convergence is convergence in the norm, also called mean-square 
convergence; that is, a sequence of functions fj, is called convergent with the limit f if 


(12%) jim || f, — fll = 0; 
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written out by (5) in Sec. 11.5 (where we can drop the square root, as this does not affect 
the limit) 


b 
(12) tim | r@L fc) — f@)P dx = 0. 


a 


Accordingly, the series (1) converges and represents /f if 


b 
(13) sim | r@x)lsn (x) — fa)? dx = 0 


where s;, is the kth partial sum of (1). 


k 
(14) sex) = >) am¥m@). 
m=0 
Note that the integral in (13) generalizes (3) in Sec. 11.4. 
We now define completeness. An orthonormal set yo, yj, --: on anintervala =x =b 
is complete in a set of functions S defined on a = x S b if we can approximate every 
f belonging to S arbitrarily closely in the norm by a linear combination dgyo + 


ayy, +++: + axyx, that is, technically, if for every € > 0 we can find constants ao, °--, a 
(with k large enough) such that 
(15) If — Goyo + +++ + anya] < € 


Ref. [GenRef7] in App. | uses the more modern term total for complete. 

We can now extend the ideas in Sec. 11.4 that guided us from (3) in Sec. 11.4 to Bessel’s 
and Parseval’s formulas (7) and (8) in that section. Performing the square in (13) and 
using (14), we first have (analog of (4) in Sec. 11.4) 


b 


b b b 
| reoxe — f(x)P dx = | rsz dx — 2| rfsj,dx + | rf? dx 


a a a a 


b 


b k 2 k b 
| | an a= 2 “| rfYm, dx + | rf? dx. 
m=0 m=0 a a 


a 


The first integral on the right equals Dda?, because S1myidx = 0 for m # 1, and 
fry2, dx = 1. In the second sum on the right, the integral equals ay, by (2) with || yn |? = 1. 
Hence the first term on the right cancels half of the second term, so that the right side 
reduces to (analog of (6) in Sec. 11.4) 


k b 
= by a+ | rf? dx. 
m=0 a 


This is nonnegative because in the previous formula the integrand on the left is nonnegative 
(recall that the weight r(x) is positive!) and so is the integral on the left. This proves the 
important Bessel’s inequality (analog of (7) in Sec. 11.4) 


k b 
(16) Se =|HrS | r(x) fy? dx (k = 1,2,+++), 


a 


3 
t 
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THEOREM 2 


PROOF 


Here we can let k—>~™, because the left sides form a monotone increasing sequence that 
is bounded by the right side, so that we have convergence by the familiar Theorem 1 in 
App. A.3.3 Hence 


(17) > ener. 
m=0 
Furthermore, if yo, y1,°** is complete in a set of functions S, then (13) holds for every f 


belonging to S. By (13) this implies equality in (16) with k—™. Hence in the case of 
completeness every fin S saisfies the so-called Parseval equality (analog of (8) in Sec. 11.4) 


fo} b 
(18) > an= lif? = | r(x) f(a)? de. 


m=0 a 


As a consequence of (18) we prove that in the case of completeness there is no function 
orthogonal to every function of the orthonormal set, with the trivial exception of a function 
of zero norm: 


Completeness 


Let yo, y1,:'* be a complete orthonormal set on a = x S b ina set of functions S. 
Then if a function f belongs to S and is orthogonal to every ym, it must have norm 
zero. In particular, if f is continuous, then f must be identically zero. 


Since f is orthogonal to every ym, the left side of (18) must be zero. If f is continuous, 
then || f|| = 0 implies f(x) = 0, as can be seen directly from (5) in Sec. 11.5 with f instead 
of y», because r(x) > 0 by assumption. |_| 


PROBLEM SET T1-.6 


1-7| FOURIER-LEGENDRE SERIES CAS EXPERIMENT 
Showing the details, develop FOURIER-LEGENDRE SERIES. Find and graph (on 
1. 63x° — 90x? + 35x common axes) the partial sums up to S,,,, whose graph 
2. (x + 12 practically coincides with that of f(x) within graphical 
4 accuracy. State mg. On what does the size of mpg seem to 
3.1 -x depend? 
4.1, x, x7, x? x4 : 
. . . : : 8. f(x) = sin 7x 
5. Prove that if f(x) is even (is odd, respectively), its ; 
Fourier—Legendre series contains only Py, (x) with even 9. f(x) = sin 27x 
m (only Py (x) with odd m, respectively). Give examples. 10. f(x) = er 
6. What can you say about the coefficients of the Fourier— 1. f(x) = $x)? 
Legendre series of f(x) if the Maclaurin series of f(x) . 
contains only powers am (m = 0,1, 2,-+-)? 12. f(x) = Jo(a01%), 0,1 = the first positive zero 
of Jo(x) 


7. What happens to the Fourier-Legendre series of a 


polynomial f(x) if you change a coefficient of f(x)? 13. f(x) = Jo(a0,2*), 0,2 = the second positive zero 
Experiment. Try to prove your answer. of Jo(x) 
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TEAM PROJECT. Orthogonality on the Entire Real 
Axis. Hermite Polynomials.® These orthogonal polyno- 
mials are defined by Heg(1) = 1 and 


d” 
(19) Hen (x) = (-1)"e" 2? —(e*"),_ an = 1,2, -. 
dx" 
REMARK. As is true for many special functions, the 
literature contains more than one notation, and one some- 
times defines as Hermite polynomials the functions 


d"e~ 


Hé=1, HA@ = (-D"e™ 


er” 


This differs from our definition, which is preferred in 
applications. 


(a) Small Values of n. Show that 
Hep(x) = x2 1; 
Heg(x) = x* — 6x2 + 3. 


He,(x) = x, 
He3(x) = x3 — 3x, 


(b) Generating Function. A generating function of the 
Hermite polynomials is 


(20) oP = Saar 
n=0 


because He,,(x) = n! a,(x). Prove this. Hint: Use the 
formula for the coefficients of a Maclaurin series and 
note that tx 3/7 = 3x? 4x 1). 

(c) Derivative. Differentiating the generating func- 
tion with respect to x, show that 

(21) He,, (x) = nHe,,— 1 (x). 


(d) Orthogonality on the x-Axis needs a weight function 
that goes to zero sufficiently fast as x > +, (Why?) 


11.7 Fourier Integral 


15. 


Show that the Hermite polynomials are orthogonal on 
ee with respect to the weight function 
r(x) =e” /” Hint. Use integration by parts and (21). 
(e) ODEs. Show that 


(22) He},(x) = xHe,(x) — Hey +1 (x). 


Using this with n — 1 instead of n and (21), show that 
y = He,(x) satisfies the ODE 


(23) y" =xy’ + ny =0. 


a?/ 


Show that w =e” 4) is a solution of Weber’s 


equation 


(24) w” + (n+4—-—4x7)w =0 


CAS EXPERIMENT. Fourier—Bessel Series. Use 
Example 2 and R = 1, so that you get the series 


(25) f(x) = ayJo(@0,1x) + a2Jo(a0,2x) 


+ agzJo(Qo,3X) + °°" 


With the zeros a1 ,2,°** from your CAS (see also 
Table Al in App. 5). 

(a) Graph the terms Jo(@01x),"--,Jo(@o,10x) for 
0 =x = 1 on common axes. 

(b) Write a program for calculating partial sums of (25). 
Find out for what f(x) your CAS can evaluate the 
integrals. Take two such f(x) and comment empirically 
on the speed of convergence by observing the decrease 
of the coefficients. 

(c) Take f(x) = 1 in (25) and evaluate the integrals 
for the coefficients analytically by (21a), Sec. 5.4, with 
v = 1. Graph the first few partial sums on common 
axes. 


Fourier series are powerful tools for problems involving functions that are periodic or are of 
interest on a finite interval only. Sections 11.2 and 11.3 first illustrated this, and various further 
applications follow in Chap. 12. Since, of course, many problems involve functions that are 
nonperiodic and are of interest on the whole x-axis, we ask what can be done to extend the 
method of Fourier series to such functions. This idea will lead to “Fourier integrals.” 

In Example | we start from a special function f,, of period 2Z and see what happens to 
its Fourier series if we let L—> ©. Then we do the same for an arbitrary function fr of 
period 2L. This will motivate and suggest the main result of this section, which is an 
integral representation given in Theorem | below. 


S8CHARLES HERMITE (1822-1901), French mathematician, is known for his work in algebra and number 
theory. The great HENRI POINCARE (1854-1912) was one of his students. 
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EXAMPLE-1 


Rectangular Wave 


Consider the periodic rectangular wave f(x) of period 2L > 2 given by 


0. 1 =—Lx%< =1 
fL@ = §1 if -l<x< 1 
0. if i Ste oie’ 


The left part of Fig. 280 shows this function for 2L = 4, 8, 16 as well as the nonperiodic function f(x), which 
we obtain from fy, if we let L— ~, 


{’ if-l<x<1l 
x) = lim f(x) = 
A jim ful , 0 otherwise. 
We now explore what happens to the Fourier coefficients of f,, as L increases. Since fy, is even, b, = O for 
all n. For a, the Euler formulas (6), Sec. 11.2, give 


1 f 1 1 f nTrx 2 i nTrX 2 sin (n7r/L) 
do cos dx : 
= 0 


dx : an b 
2L J-1 L L L Lnaw/L 
This sequence of Fourier coefficients is called the amplitude spectrum of f;, because |a,,| is the maximum 
amplitude of the wave a, cos (n7rx/L). Figure 280 shows this spectrum for the periods 2L = 4, 8, 16. We see 
that for increasing L these amplitudes become more and more dense on the positive w,,-axis, where w, = n7/L. 
Indeed, for 2L = 4, 8, 16 we have 1, 3, 7 amplitudes per “half-wave” of the function (2 sin w,)/(Lw,,) (dashed 
in the figure). Hence for 2L = 2* we have 2*-1 — 1 amplitudes per half-wave, so that these amplitudes will 
eventually be everywhere dense on the positive w,,-axis (and will decrease to zero). 

The outcome of this example gives an intuitive impression of what about to expect if we turn from our special 
function to an arbitrary one, as we shall do next. Bo 


Waveform f, (x) Amplitude spectrum a, (w, ) 


n=1 w,=nnlL 


2 8] 2 x SY é, Ww, 
1 
calla n=2 
f,(~) \ n=10 
= = ‘ pots 
4 0 4 x i al “7 
<—2r =8—>| n=6 n=14 
n=4 
a oT = 
-8 0 8 x = “4 w, 


n=12 n= 28 


K-21, = 16> 


f(x) 


-101 x 


Fig. 280. Waveforms and amplitude spectra in Example 1 
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From Fourier Series to Fourier Integral 


We now consider any periodic function f(x) of period 2 that can be represented by a 
Fourier series 


fi) = ag + > (ay. cos Wyx + by sin yx), Wn => 
n=1 
and find out what happens if we let L—~. Together with Example 1 the present 
calculation will suggest that we should expect an integral (instead of a series) involving 
cos wx and sin wx with w no longer restricted to integer multiples w = wy, = n7/L 
of 7/L but taking all values. We shall also see what form such an integral might 
have. 
If we insert a, and b,, from the Euler formulas (6), Sec. 11.2, and denote the variable 
of integration by v, the Fourier series of f;,(x) becomes 


L 
1 jee 
SL) = | fi) dv + =p) 


L 
co WX | FLW) cos wyv dv 
-L n=1 -L 


L 
+ sin vn | FL(v) sin Wyv dv |. 


We now set 


1t+DT nt oa 
i L ~ de 


Aw = Wn+1 — Wn = 


Then 1/L = Aw/77, and we may write the Fourier series in the form 


oo L 


L 
(1) fio = = | io dv + => (os Wx) Aw [ fi) cos wyv dv 


n=1 
i, 
+ (sin wmv | SL) sin wyv a 


Si 


This representation is valid for any fixed L, arbitrarily large, but finite. 
We now let L— © and assume that the resulting nonperiodic function 
fx) = him fz,00) 
is absolutely integrable on the x-axis; that is, the following (finite!) limits exist: 


0 b cd 
(2) lim | | f(x)| dx + lim | | f(x) | dx (oriten lf(~)| a) 
es a ci 0 


Then 1/L—0, and the value of the first term on the right side of (1) approaches zero. 
Also Aw = 77/L—0 and it seems plausible that the infinite series in (1) becomes an 
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EXAMPLE 2 


integral from 0 to ~, which represents f(x), namely, 


oo 


(3) f@ = z| co wx f(v) cos wu dv + sin wx F(v) sin wu av Jaw 


0 00 —00 
If we introduce the notations 


C-) 


(4) A(w) = =| f(v) cos wu dv, Bw) = =| F(v) sin wu du 


—2 —-7 


we can write this in the form 


(5) FEDS | [A(w) cos wx + B(w) sin wx] dw. 
0 


This is called a representation of f(x) by a Fourier integral. 

It is clear that our naive approach merely suggests the representation (5), but by no 
means establishes it; in fact, the limit of the series in (1) as Aw approaches zero is not 
the definition of the integral (3). Sufficient conditions for the validity of (5) are as follows. 


Fourier Integral 


If f(x) is piecewise continuous (see Sec. 6.1) in every finite interval and has a right- 
hand derivative and a left-hand derivative at every point (see Sec 11.1) and if the 
integral (2) exists, then f(x) can be represented by a Fourier integral (5) with A and 
B given by (4). At a point where f(x) is discontinuous the value of the Fourier integral 
equals the average of the left- and right-hand limits of f(x) at that point (see Sec. 11.1). 
(Proof in Ref. [C12]; see App. 1.) 


Applications of Fourier Integrals 


The main application of Fourier integrals is in solving ODEs and PDEs, as we shall see 
for PDEs in Sec. 12.6. However, we can also use Fourier integrals in integration and in 
discussing functions defined by integrals, as the next example. 


Single Pulse, Sine Integral. Dirichlet’s Discontinuous Factor. Gibbs Phenomenon 


Find the Fourier integral representation of the function 


1 
ie Fig. 281 
ae is if xl >1 eee 


Fig. 281. Example 2 
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Solution. From (4) we obtain 


co 1 : 1 : 
1 sin wv 2 sin w 
A(w) == | f(v) cos wu dv = cos wu du = 
7 Tw ; Tw 
= =i = 


1 
1 
Bw) = 7 | sin wv du = 0 
-1 


and (5) gives the answer 


0 : 
COS WX SIN W 


: 2 
(6) fa) = = | — dw, 


w 
0 


The average of the left- and right-hand limits of f(x) at x = 1 is equal to (1 + 0)/2, that is, 3. 
Furthermore, from (6) and Theorem 1 we obtain (multiply by 7/2) 


m/2 if O=x<1, 


(7) dw = 47/4 if x=1, 


Fe 
| cos wx sin w 
0 Ww 


0 if x>1. 


We mention that this integral is called Dirichlet’s discontinous factor. (For P. L. Dirichlet see Sec. 10.8.) 
The case x = 0 is of particular interest. If x = 0, then (7) gives 


(8*) [ 2 aw=% 


We see that this integral is the limit of the so-called sine integral 


tae! 
sin w 


(8) Siu) = | ae 


0 


as u-—>®, The graphs of Si(u) and of the integrand are shown in Fig. 282. 

In the case of a Fourier series the graphs of the partial sums are approximation curves of the curve of the 
periodic function represented by the series. Similarly, in the case of the Fourier integral (5), approximations are 
obtained by replacing © by numbers a. Hence the integral 


7 


a ft 
2 cos wx sin w 
(9) | = dw 
w 
0 


approximates the right side in (6) and therefore f(x). 


Integrand 
: ' 
ee tet Ted ‘I a _ 
-4n~ 3x -2n~--In 0 ln---2n 3n” Anu 
-0.5- 
1b 


Fig. 282. Sine integral Si(u) and integrand 
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a=8 a=16 a=32 


2x oy 0 T° 2% 


YN ti n ™ 1 Ll 
aoa, 0) a¥ Dx 2 ‘10! 1 
Fig. 283. The integral (9) for a = 8, 16, and 32, illustrating 
the development of the Gibbs phenomenon 


Figure 283 shows oscillations near the points of discontinuity of f(x). We might expect that these oscillations 
disappear as a approaches infinity. But this is not true; with increasing a, they are shifted closer to the points 
x = +1. This unexpected behavior, which also occurs in connection with Fourier series (see Sec. 11.2), is known 
as the Gibbs phenomenon. We can explain it by representing (9) in terms of sine integrals as follows. Using 
(11) in App. A3.1, we have 


dw. 


Wt 


n 
7 Ww T Ww 7 


2 (° cos wx sin w 1 (° sin (w + wx) 1 (% sin (w — wx) 
dw d 
w 
0 0 0 


In the first integral on the right we set w + wx = ¢t. Then dw/w = dt/t, and 0 Sw <a corresponds to 


0 =t=(« + la. In the last integral we set w — wx = —1. Then dw/w = dt/t, and 0 S w S a corresponds to 
0 StS(x — Ia. Since sin (—t) = —sin t, we thus obtain 
2 (“cos wx sin w 1 (°* sine 1 (°°? sin t 
dw dt dt. 
T w t 7 t 
0 0 0 


From this and (8) we see that our integral (9) equals 


~ sicalx + ]]) ~ sicatx 1}) 


and the oscillations in Fig. 283 result from those in Fig. 282. The increase of a amounts to a transformation 
of the scale on the axis and causes the shift of the oscillations (the waves) toward the points of discontinuity 
—1 and 1. | 


Fourier Cosine Integral and Fourier Sine Integral 


Just as Fourier series simplify if a function is even or odd (see Sec. 11.2), so do Fourier 
integrals, and you can save work. Indeed, if f has a Fourier integral representation and is 
even, then B(w) = 0 in (4). This holds because the integrand of B(w) is odd. Then (5) 
reduces to a Fourier cosine integral 


(oo) 


a0) fW= | A(w) cos wx dw where A(w) = 2 | Fv) cos wu dv. 
0 0 


Note the change in A(w): for even f the integrand is even, hence the integral from —© to 
co equals twice the integral from 0 to ©, just as in (7a) of Sec. 11.2. 

Similarly, if fhas a Fourier integral representation and is odd, then A(w) = 0 in (4). This 
is true because the integrand of A (w) is odd. Then (5) becomes a Fourier sine integral 


(oo) 


ay fwe= | B(w) sin wx dw where B(w) = =| Ff(v) sin wu dv. 
0 0 


oo 


5 


0 


16 


EXAMPLE 3 


Fig. 284. f(x) 
in Example 3 
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Note the change of B(w) to an integral from 0 to © because B(w) is even (odd times odd 
is even). 

Earlier in this section we pointed out that the main application of the Fourier integral 
representation is in differential equations. However, these representations also help in 
evaluating integrals, as the following example shows for integrals from 0 to ~. 


Laplace Integrals 
We shall derive the Fourier cosine and Fourier sine integrals of f(x) = eT, where x > 0 and k > 0 (Fig. 284). 


The result will be used to evaluate the so-called Laplace integrals. 


: 2 
Solution. (a) From (10) we have A(w) = 2| e~*” cos wu dv. Now, by integration by parts, 
0 


= k -k w 
fe KY cos wo dv 3 ae i sin wv + cos wu }. 
k°+w k 


If v = 0, the expression on the right equals —k/ (k2 + w?). Ifv approaches infinity, that expression approaches 
zero because of the exponential factor. Thus 2/77 times the integral from 0 to gives 


2k/ ar 
(12) A(w) = ——. 
k? + w? 


By substituting this into the first integral in (10) we thus obtain the Fourier cosine integral representation 


f(x) = ew ze cos wx ee 
x)=e Ww x : 
T Jy k? + w? 
From this representation we see that 
cos wx 7 
(13) | p= a pe (x>0, k>0). 
0 ke + w? 2k 
_ : es a : . 
(b) Similarly, from (11) we have B(w) = a sin wu dv. By integration by parts, 
0 
—kov |: Ww —ku k + 
e” sin wu du 3 ae sin wv + cos wu J}. 
k“ +w w 
This equals —w/(k? + w?) if v = 0, and approaches 0 as v >. Thus 
2w/77 
(14) B(w) = ——~. 
k? + w? 
From (14) we thus obtain the Fourier sine integral representation 
2 (“w sin wx 
f(x) eke | Ww. 
us 0 k2 + w? 
From this we see that 
“w sin wx TE ee 
(15) —_——_ dw = —e g>0, £20). 
op ke + w? 2 


The integrals (13) and (15) are called the Laplace integrals. | 
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1-6) EVALUATION OF INTEGRALS 


Show that the integral represents the indicated function. 
Hint. Use (5), (10), or (11); the integral tells you which one, 
and its value tells you what function to consider. Show your 
work in detail. 


0 if x<0 


dx = 7/2 


cos xw + w sin xw . 
1. | if x=0 


0 1+w? 
we * if x>0 


OSx5T7 


oo os a To: : 
sin 77w sin xw 3 sinx if 
Wy = 
0 


1—w? 0) if x> 7 


dar if O< x <7 


—— 
0 0 if 


sin xw dw -{ 
KP 


1 : 

” cos 5 TW dq cosx if 0< |x| < 37 
4, cos xw dw = 

0 


1-—w 0 if |x| = dq 
dqx if O<x<1 
. | me sin. xw dw = 40 if x=1 
5 w 
0 if x>1 
6. fw w* sin zw ay, = 57e “cosx if x>0 
0 wtt+4 
7-12 | FOURIER COSINE INTEGRAL 
REPRESENTATIONS 
Represent f(x) as an integral (10). 
1 if O<x<1 
7.40) =4 
0 if x> I 
x if O<x<1 
8. f(x) = { 
0 if pa 


9. f(x) = 1/1. + x?) 


ax? if O<x<a 
10. 09 = { 
0 


if x>a 
sin x 
11. f(x) = { 
0 


et 
12. f(x) = { 
0 


[x > 0. Hint. See (13).] 


if O<x<7 
if x> 7 
if O<x<a 
if x>a 


13. CAS EXPERIMENT. Approximate Fourier Cosine 
Integrals. Graph the integrals in Prob. 7, 9, and 11 as 
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PROBLEM SET 11.7 


functions of x. Graph approximations obtained by 

replacing © with finite upper limits of your choice. 

Compare the quality of the approximations. Write a 

short report on your empirical results and observations. 
14. PROJECT. Properties of Fourier Integrals 

(a) Fourier cosine integral. Show that (10) implies 


(al) flax) = | a() cos xw dw 
0 


(a > 0) (Scale change) 


(a2) xf(x) = | B’(w) sin xw dw, 
0 


A as in (10) 


(a3) x7 f(x) = | A*(w) cos xw dw, 
0 


(b) Solve Prob. 8 by applying (a3) to the result of Prob. 7. 
(c) Verify (a2) for f@~y=1 if O<x<a and 
fx =0 if x>a. 

(d) Fourier sine integral. Find formulas for the Fourier 
sine integral similar to those in (a). 


15. CAS EXPERIMENT. Sine Integral. Plot Si(u) for 
positive u. Does the sequence of the maximum and 
minimum values give the impression that it converges 
and has the limit 77/2? Investigate the Gibbs phenomenon 
graphically. 


16-20 | FOURIER SINE INTEGRAL 


REPRESENTATIONS 
Represent f(x) as an integral (11). 


x if O<x<a 
16. f(x) = i 
i x>a 


1 if O<x<l 
17. f(x) = 
x>1 
cosx if O<x<7 
18. f(x) = co 
M2 
if O<x< 1 
19. f(x) = 
0 em | 
if O<x< 1 
20. f(x) = eo 
0 x>1 
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11.8 Fourier Cosine and Sine Transforms 


An integral transform is a transformation in the form of an integral that produces from 
given functions new functions depending on a different variable. One is mainly interested 
in these transforms because they can be used as tools in solving ODEs, PDEs, and integral 
equations and can often be of help in handling and applying special functions. The Laplace 
transform of Chap. 6 serves as an example and is by far the most important integral 
transform in engineering. 

Next in order of importance are Fourier transforms. They can be obtained from the 
Fourier integral in Sec. 11.7 in a straightforward way. In this section we derive two such 
transforms that are real, and in Sec. 11.9 a complex one. 


Fourier Cosine Transform 


The Fourier cosine transform concerns even functions f(x). We obtain it from the Fourier 
cosine integral [(10) in Sec. 10.7] 


0 


f@ = | A(w) cos wx dw, where A(w) = 2 | f(@) cos wu dv. 
0 0 


Namely, we set A(w) = V 2/7 f.(w), where c suggests “cosine.” Then, writing v = x in 
the formula for A(w), we have 


(1a) f.w) = 2 | f(x) cos wx dx 


0 


and 


(1b) f(x) = 2 | f.(w) cos wx dw. 


0 


Formula (la) gives from f(x) a new function fw), called the Fourier cosine transform 
of f(x). Formula (1b) gives us back f(x) from fw), and we therefore call f(x) the inverse 
Fourier cosine transform of fw). 

The process of obtaining the transform 7 from a given f is also called the Fourier 
cosine transform or the Fourier cosine transform method. 


Fourier Sine Transform 


Similarly, in (11), Sec. 11.7, we set B(w) = V 2/7 fw), where s suggests “‘sine.” Then, 
writing v = x, we have from (11), Sec. 11.7, the Fourier sine transform, of f(x) given by 


(2a) fw) = 2 | f(x) sin wx dx, 
0 
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Fig. 285. f(x) in 
Example 1 


EXAMPLE 2 


and the inverse Fourier sine transform of fy (w), given by 


(2b) f@) = 2 | f,(w) sin wx dw. 


0 


The process of obtaining f,(w) from f(x) is also called the Fourier sine transform or 
the Fourier sine transform method. 
Other notations are 


Tet) = jean ts) sas 
and ¥, | and ¥, ! for the inverses of ¥, and ¥,, respectively. 


Fourier Cosine and Fourier Sine Transforms 


Find the Fourier cosine and Fourier sine transforms of the function 


w= { if O<x<a (Fie. 285 
DY a apg en 


Solution. From the definitions (1a) and (2a) we obtain by integration 


~ a f2 = _ /2_ (sinaw 
fe(w) = = k E cos wx dx = a k a 
n 2 os ae d 25 1 — cos aw 
fs(w) = } sin wx dx = re F 


This agrees with formulas | in the first two tables in Sec. 11.10 (where k = 1). 
Note that for f(x) = k = const (0 < x < ©), these transforms do not exist. (Why?) Bi 


Fourier Cosine Transform of the Exponential Function 
Find ¥,(e~*). 


Solution. By integration by parts and recursion, 


_ 2 f° = 2 @¢* i V2/%7 
F(e-”) = ,/— | e-* cos wx dx x (—cos wx + w sin wx)| = ———5. 
T Jo qTit+w 0 l+w 


This agrees with formula 3 in Table I, Sec. 11.10, with a = 1. See also the next example. =] 


What did we do to introduce the two integral transforms under consideration? Actually 
not much: We changed the notations A and B to get a “symmetric” distribution of the 
constant 2/7 in the original formulas (1) and (2). This redistribution is a standard con- 
venience, but it is not essential. One could do without it. 

What have we gained? We show next that these transforms have operational properties 
that permit them to convert differentiations into algebraic operations (just as the Laplace 
transform does). This is the key to their application in solving differential equations. 
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Linearity, Transforms of Derivatives 


If f(x) is absolutely integrable (see Sec. 11.7) on the positive x-axis and piecewise 
continuous (see Sec. 6.1) on every finite interval, then the Fourier cosine and sine 
transforms of f exist. 

Furthermore, if f and g have Fourier cosine and sine transforms, so does af + bg for 
any constants a and b, and by (la) 


F.(af + bg) = /2/ [af(x) + bg(x)] cos wx dx 


0 


Pe [2 [~ 
a | F(x) cos wx dx + b =| g(x) cos wx dx. 
0 0 


The right side is a#,(f) + b¥,(g). Similarly for ¥,, by (2). This shows that the Fourier 
cosine and sine transforms are linear operations, 


(a) F (af + bg) = ak (f) aT bF(g), 
(b) F(af + bg) = aF(f) + b¥s(g). 


(3) 


Cosine and Sine Transforms of Derivatives 


Let f(x) be continuous and absolutely integrable on the x-axis, let f' (x) be piecewise 
continuous on every finite interval, and let f(x) >0 as x ~~. Then 


@) FA f'@} = wf} - [210 
(b) FAf' O)} = —wF {fo}. 


(4) 


This follows from the definitions and by using integration by parts, namely, 


Ff f'(~)} = /2/ f'(@® cos wx dx 


0 


2 
‘7 Lye COS Wx , 


+ wf F(x) sin wx as| 
2 fe é 
— |Z FO) + wH{FO}: 


0 
Fs{ f' (x)} = 2 f'@® sin wx dx 
0 
= wf F(x) cos wx as| 


— 2 fos sin wx H ; 


= 0 — wH {fix}. a 


io) 


and similarly, 


SEC. 11.8 Fourier Cosine and Sine Transforms 


EXAMPLE 3 


521 


Formula (4a) with f’ instead of f gives (when f’, f” satisfy the respective assumptions 


for f, f’ in Theorem 1) 


” q 2 
FeAlf (x)} = wH ff OO} — /3s (0); 


hence by (4b) 


" 2) , 
(Sa) Half @)) = —w FAlf@} — j2r (0). 
Similarly, 

A 2, 
(Sb) K(f" @)} = —w"F.(f@)} + 270 


A basic application of (5) to PDEs will be given in Sec. 12.7. For the time being we 


show how (5) can be used for deriving transforms. 


An Application of the Operational Formula (5) 


a. 


Find the Fourier cosine transform F (e~~) of f(x) = e— * where a > 0. 


2 -ax, 


Solution. By differentiation, (e~%”)" = a?e~®; thus 
a°fx) = f"@). 
From this, (5a), and the linearity (3a), 


OF f) = Fel f") 


= a2 = 2 a) 
= —w'F(f) al ¢ ) 
2 . [2 
= -w'F(f) +a = 


Hence 
(a2 + w9F( f) = aV2/7. 
The answer is (see Table I, Sec. 11.10) 


_ 2) 
Fle me) ma 7 ( 
T 


a 


a’ +w 


) (a> 0). 


Tables of Fourier cosine and sine transforms are included in Sec. 11.10. 
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1-8 


FOURIER COSINE TRANSFORM 


1. 


Find the cosine transform fw) of f(x) =1 if 
O0<x<1, fw=-l if l<x<2, f@) =0 if 
5 a ie 


. Find fin Prob. 1 from the answer 7 
. Find fw) for fa) =x if O0<x<2, f(x) =0 if 


x > 2: 


4. Derive formula 3 in Table I of Sec. 11.10 by integration. 


7. 


2 Find f.(w) for f(x) =x if0<x<1, f(x) = Oifx >1. 
. Continuity assumptions. Find g.(w) for g(x) = 2 if 


O<x<1, g(x) = Oif x > 1. Try to obtain from it 
fw) for f(x) in Prob. 5 by using (5a). 

Existence? Does the Fourier cosine transform of 
x? sinx (O<x< oo) exist? Of x ~tcosx? Give 
reasons. 


. Existence? Does the Fourier cosine transform of 


f(x) = k = const (0 < x < %) exist? The Fourier sine 
transform? 


11.9 Fourier Transform. 
Discrete and Fast Fourier Transforms 


9. 
10. 
11. 


12. 
13. 


14. 


15. 


FOURIER SINE TRANSFORM 
Find ¥,(e—™), 
Obtain the answer to Prob. 9 from (5b). 


Find f,(w) for f(x) = x if O<x< 1, f@ =0 if 
x > 1. 

Find F(xe7* /2) from (4b) and a suitable formula in 
Table I of Sec. 11.10. 


Find ¥,(e~”) from (4a) and formula 3 of Table I in 
Sec. 11.10. 


a > 0, by integration. 


Gamma function. Using formulas 2 and 4 in Table II 
of Sec. 11.10, prove P(g) = Vz [(30) in App. A3.1], 
a value needed for Bessel functions and other 
applications. 


WRITING PROJECT. Finding Fourier Cosine and 
Sine Transforms. Write a short report on ways of 
obtaining these transforms, with illustrations by 
examples of your own. 


In Sec. 11.8 we derived two real transforms. Now we want to derive a complex transform 
that is called the Fourier transform. It will be obtained from the complex Fourier integral, 


which will be discussed next. 


Complex Form of the Fourier Integral 
The (real) Fourier integral is [see (4), (5), Sec. 11.7] 


f(x) = | [A(w) cos wx + B(w) sin wx] dw 


0 


where 


Aw) = =| f(v) cos wu dv, Bw) = +| 


—7 


oo 


Fv) sin wu dv. 


—02 


Substituting A and B into the integral for f, we have 


1 io] oo 
f(x) = =| | f(v)[cos wu cos wx + sin wu sin wx] du dw. 


0) —0o 
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By the addition formula for the cosine [(6) in App. A3.1] the expression in the brackets 
[---] equals cos (wu — wx) or, since the cosine is even, cos (wx — wu). We thus obtain 


(1*) f(x) = +| | f(v) cos (wx — woe dw. 


0 —0 


The integral in brackets is an even function of w, call it F(w), because cos (wx — wv) is 
an even function of w, the function f does not depend on w, and we integrate with respect 
to v (not w). Hence the integral of F(w) from w = 0 to ~ is 5 times the integral of F(w) 
from —® to ~. Thus (note the change of the integration limit!) 


1 io] eo 
(1) f(x) = | | f(v) cos (wx — wu) «| dw. 
We claim that the integral of the form (1) with sin instead of cos is zero: 
1 oo eo 
(2) | | f(v) sin (wx — wv) ‘| dw = 0. 
277 lites 


This is true since sin (wx — wv) is an odd function of w, which makes the integral in 
brackets an odd function of w, call it G(w). Hence the integral of G(w) from —© to ~ 
is zero, as claimed. 

We now take the integrand of (1) plus i (= \V—1) times the integrand of (2) and use 
the Euler formula [(11) in Sec. 2.2] 


(3) e” = cosx + isinx. 
Taking wx — wu instead of x in (3) and multiplying by f(v) gives 
f(v) cos (wx — wu) + if(v) sin (wx — wv) = fvryeXVF—™, 
Hence the result of adding (1) plus i times (2), called the complex Fourier integral, is 
1 eo) ino) ; - 
(4) 1) | | fvye’? du dw G= V-}). 


To obtain the desired Fourier transform will take only a very short step from here. 


Fourier Transform and Its Inverse 
Writing the exponential function in (4) as a product of exponential functions, we have 


(5) f(x) = ae [ E- fe fwe ‘| el dy, 


The expression in brackets is a function of w, is denoted by f (w), and is called the Fourier 
transform of f; writing v = x, we have 


(6) fwy= = | f@e™ dx. 


—7 


° 
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With this, (5) becomes 


(7) f(x) = Je J fone dw 


and is called the inverse Fourier transform of f(w). 
Another notation for the Fourier transform is 


A 


f= Ff), 


so that 


The process of obtaining the Fourier transform 4(f) = f from a given f is also called 
the Fourier transform or the Fourier transform method. 

Using concepts defined in Secs. 6.1 and 11.7 we now state (without proof) conditions 
that are sufficient for the existence of the Fourier transform. 


Existence of the Fourier Transform 


If f(x) is absolutely integrable on the x-axis and piecewise continuous on every finite 
interval, then the Fourier transform fw) of f(x) given by (6) exists. 


Fourier Transform 
Find the Fourier transform of f(x) = 1 if |x| < 1 and f(x) = 0 otherwise. 


Solution. Using (6) and integrating, we obtain 


1 _. 
x 1 : 1 ee lt 1 = ; 
fw) = | eX dy . (e aw et”), 
V27 Jy V27 —iw!l-1 —iwV27 
As in (3) we have e’”’ = cosw + isin Ww, e” = cos w — isin w, and by subtraction 


mo _ 9 — Ii sin w. 


Substituting this in the previous formula on the right, we see that 7 drops out and we obtain the answer 
4“ 7 sinw 
fw) =. [——. a 
2 w 


Find the Fourier transform ¥(e~%”) of f(x) = e~™ if x > 0 and f(x) = 0 if x < 0; here a > 0. 


Fourier Transform 


Solution. From the definition (6) we obtain by integration 


F (e~*) = 1 | ee twx dx 
V27F Jo 
1 eo at iw)x o 1 
V 27 —(a + iw)!*=90 V 2a (a + iw) 


This proves formula 5 of Table III in Sec. 11.10. 2] 
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Physical Interpretation: Spectrum 


The nature of the representation (7) of f(x) becomes clear if we think of it as a superposition 
of sinusoidal oscillations of all possible frequencies, called a spectral representation. 
This name is suggested by optics, where light is such a superposition of colors 
(frequencies). In (7), the “spectral density” fw) measures the intensity of f(x) in the 
frequency interval between w and w + Aw (Aw small, fixed). We claim that, in connection 
with vibrations, the integral 


| | fowy|? dw 
can be interpreted as the total energy of the physical system. Hence an integral of | a (w)|? 
from a to b gives the contribution of the frequencies w between a and b to the total energy. 


To make this plausible, we begin with a mechanical system giving a single frequency, 
namely, the harmonic oscillator (mass on a spring, Sec. 2.4) 


my" + ky = 0. 


Here we denote time t by x. Multiplication by y’ gives my’y” + ky'y = 0. By integration, 


amu" - a ky” = Eo = const 


where v = y’ is the velocity. The first term is the kinetic energy, the second the potential 
energy, and Eo the total energy of the system. Now a general solution is (use (3) in 
Sec. 11.4 with t = x) 


y = ay cos Wox + by sin wox = cye?™ + c_yo HO, we = k/m 
where cy = (a1 — iby)/2,c_1 = cy = (a1 + iby)/2. We write simply A = cer, 
B= c_ye ““*. Then y = A + B. By differentiation, v = y’ = A’ + B’ = iwo(A — B). 
Substitution of v and y on the left side of the equation for Eo gives 


Eo = gmv” + sky” = dm(iwo)(A — B® + 3k(A + BY. 
Here wa = k/m, as just stated; hence mwa = k. Also i? = —1, so that 
Eo = 5k[—(A — BY? + (A + B)?] = 2KAB = 2k c_ eM = 2keyo_y = 2k|cy|*. 


Hence the energy is proportional to the square of the amplitude |c,|. 

As the next step, if a more complicated system leads to a periodic solution y = f(x) 
that can be represented by a Fourier series, then instead of the single energy term ley|? 
we get a series of squares |c,|" of Fourier coefficients cy, given by (6), Sec. 11.4. In this 
case we have a “discrete spectrum” (or “point spectrum’) consisting of countably many 
isolated frequencies (infinitely many, in general), the corresponding ley |? being the 
contributions to the total energy. 

Finally, a system whose solution can be represented by an integral (7) leads to the above 
integral for the energy, as is plausible from the cases just discussed. 
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Linearity. Fourier Transform of Derivatives 


New transforms can be obtained from given ones by using 


THEOREM 2 Linearity of the Fourier Transform 


The Fourier transform is a linear operation; that is, for any functions f(x) and g(x) 
whose Fourier transforms exist and any constants a and b, the Fourier transform 
of af + bg exists, and 


(8) F(af + bg) = ak (f) + b#(g). 


PROOF This is true because integration is a linear operation, so that (6) gives 


1 ss . 
F{af(x) + bg(x)} = al laf(x) + bg(we""* dx 
J feserim dx +b oe [sto dx 


=a 


V21T 
= ak{ f(x)} + bF{g@}. a 


In applying the Fourier transform to differential equations, the key property is that 
differentiation of functions corresponds to multiplication of transforms by iw: 


THEOREM 3 Fourier Transform of the Derivative of f(x) 


Let f(x) be continuous on the x-axis and f(x) 0 as |x| — %, Furthermore, let f' (x) 
be absolutely integrable on the x-axis. Then 


(9) F lf’ (x)} = iwF {f@)}. 


PROOF From the definition of the Fourier transform we have 


’ 1 ° ’ —iwx 
F{f (x)} = all f' @e dx. 


Integrating by parts, we obtain 


Ff} = 


— (-iw) | f@e ac 


—2 


1 —iwx 
an i (xJe 


Since f(x) > 0 as |x| — 0%, the desired result follows, namely, 


Fl f'()} = 0 + iwF{fOo}. a 
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EXAMPLE 3 


THEOREM 4 


Two successive applications of (9) give 
Ff") = wh (f!) = (iwPF(f). 


Since (iw)? = —w?, we have for the transform of the second derivative of f 
(10) Ff" (@)} = —w°F(f@)}. 


Similarly for higher derivatives. 
An application of (10) to differential equations will be given in Sec. 12.6. For the time 
being we show how (9) can be used to derive transforms. 


Application of the Operational Formula (9) 
Find the Fourier transform of xe" from Table II, Sec 11.10. 


Solution. We use (9). By formula 9 in Table IIT 


ll 

| 

| 
3 


Convolution 


The convolution f + g of functions f and g is defined by 


io) 


(11) h(x) = (f* g)() = | f(p)g (x — p) dp = | f(x — p)g (Pp) ap. 


—2 —01 


The purpose is the same as in the case of Laplace transforms (Sec. 6.5): taking the 
convolution of two functions and then taking the transform of the convolution is the same 
as multiplying the transforms of these functions (and multiplying them by V 277): 


Convolution Theorem 


Suppose that f(x) and g(x) are piecewise continuous, bounded, and absolutely 
integrable on the x-axis. Then 


(12) F( fg) = V2 Ff) F(g). 
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By the definition, 


1 Prien - 
F(fa)= | | free -rrdve OO dx. 


An interchange of the order of integration gives 


1 ice fia ~ 
F(fe9)= =| | fee - re “% dx dp. 


—2 > —o 


Instead of x we now take x — p = g as a new variable of integration. Then x = p + q 
and 


i a ie oe 
F (fx 8) = | | f(p)g (ge ?*® dq dp. 


This double integral can be written as a product of two integrals and gives the desired 
result 


1 “ - = 
F (f * g) -t_| f(pye | g(qve “4 dq 


V27r 
= yalvn F (FMV 2a F (g)] = VIF (fF (a). a 


By taking the inverse Fourier transform on both sides of (12), writing 7 = #(f) and 
& = #(g) as before, and noting that V27 and 1/277 in (12) and (7) cancel each other, 
we obtain 


(13) CGeG)= | fwg we” dw, 


—02 


a formula that will help us in solving partial differential equations (Sec. 12.6). 


Discrete Fourier Transform (DFT), 
Fast Fourier Transform (FFT) 


In using Fourier series, Fourier transforms, and trigonometric approximations (Sec. 11.6) 
we have to assume that a function f(x), to be developed or transformed, is given on some 
interval, over which we integrate in the Euler formulas, etc. Now very often a function f(x) 
is given only in terms of values at finitely many points, and one is interested in extending 
Fourier analysis to this case. The main application of such a “discrete Fourier analysis” 
concerns large amounts of equally spaced data, as they occur in telecommunication, time 
series analysis, and various simulation problems. In these situations, dealing with sampled 
values rather than with functions, we can replace the Fourier transform by the so-called 
discrete Fourier transform (DFT) as follows. 
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Let f(x) be periodic, for simplicity of period 277. We assume that N measurements of 
f(x) are taken over the interval 0 S x S 277 at regularly spaced points 


(14) XE =—, k=O, Low N =, 


We also say that f(x) is being sampled at these points. We now want to determine a 
complex trigonometric polynomial 


N-1 
(15) qx) = dene 
n=0 


that interpolates f(x) at the nodes (14), that is, g(x,) = f(x;,), written out, with fj, denoting 
f@x) 


N-1 
(16) fie = fx) = 9n) = Dene, bo UA oe = DU, 
n=0 


Hence we must determine the coefficients cg, ---, cy— such that (16) holds. We do this 
by an idea similar to that in Sec. 11.1 for deriving the Fourier coefficients by using the 
orthogonality of the trigonometric system. Instead of integrals we now take sums. Namely, 
we multiply (16) by e~“"** (note the minus!) and sum over k from 0 to N — 1. Then we 
interchange the order of the two summations and insert x; from (14). This gives 


N-1 N-1 Nel JN=b 


(17) Shem = = »> De ein MLE — ae nye m)2ark/N- 


=0 n=0 
Now 
eln—mamTk/N _ [eee 
We donote [---] by r. For n = m we have r = e° = 1. The sum of these terms over k 


equals N, the number of these terms. For n # m we have r # | and by the formula for a 
geometric sum [(6) in Sec. 15.1 with g = randn = N — 1] 


N-1 N 
1 — 

Sy re = Y= 

k=0 I< 


r 


because rN = 1; indeed, since k, m, and n are integers, 
rN = e—meak — cos 2ark(n — m) + isin 2ark(n — m) = 1+0=1. 


This shows that the right side of (17) equals c,,N. Writing n for m and dividing by N, we 
thus obtain the desired coefficient formula 


N 
(18*) es 5h ent f= f(xy), n=0,10,N- 1. 


Since computation of the c,, (by the fast Fourier transform, below) involves successive 
halfing of the problem size N, it is practical to drop the factor 1/N from cy and define the 
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discrete Fourier transform of the given signal f = [fg --- fy_1]' to be the vector 
f =[fo °-:: fn—1] with components 
A N-1 ; 
(te) eee f.—f,), n=0,---,N-1. 
k=0 


This is the frequency spectrum of the signal. 
In vector notation, f = Fyf, where the N X N Fourier matrix Fy = [e,;,] has the 
entries [given in (18)] 


(19) nk = ent = e2Tink/N = ie w= wy = err. 


where n,k = 0,:::,N — 1. 


Discrete Fourier Transform (DFT). Sample of N = 4 Values 


Let N = 4 measurements (sample values) be given. Then w = e 27/N — o- 7/2 — _jandthusw™ = (—i)™. 
Let the sample values be, sayf=[0 1 4 gy". Then by (18) and (19), 


w w w w 1 1 i 1 0 14 
; we wh ow w3 1 -i -l i||1 —4 + 8i 
(20) f =F,f= f= = 
wo w2 wt w§ 1 -=1 tb, -=1 4 —6 
w? we we wp? 1 i -1 -i||9 —4-8i 


From the first matrix in (20) it is easy to infer what Fy looks like for arbitrary N, which in practice may be 
1000 or more, for reasons given below. i 


From the DFT (the frequency spectrum) f= F,f we can recreate the given signal 


A = - 1 
f = Fy'f, as we shall now prove. Here Fy and its complex conjugate Fy = N aaa 


satisfy 
(21a) FyFy = FyFy = MI 
where I is the N X N unit matrix; hence Fy has the inverse 


i= 
21b Fy! = — Fy. 
(21b) N nen 


We prove (21). By the multiplication rule (row times column) the product matrix 
Gy = FyFy = [gjx] in (21a) has the entries gj, = Row j of Fy times Column k of Fy. 
That is, writing W = ww", we prove that 

gi = (Wiwk? + (Wiwkyt + +--+ WN 
0 if j#R 


=W+wi+--- +wis =f 
N if j=k. 


SEC. 11.9 Fourier Transform. Discrete and Fast Fourier Transforms 531 


Indeed, when j = k, then wkw" = (ww) = (e27/Ne~27/N)k — 1k = 1, so that the sum 
of these N terms equals N; these are the diagonal entries of Gy. Also, when j # k, then 


W # 1 and we have a geometric sum (whose value is given by (6) in Sec. 15.1 withg = W 
and n = N -1) 


wot+wit--- +wNt =——_=090 


because WN = (wi hyN = (e27i(e-27ye = Fh = 1, - 


We have seen that f is the frequency spectrum of the signal f(x). Thus the components 
tn of f give a resolution of the 27r-periodic function f(x) into simple (complex) harmonics. 
Here one should use only n’s that are much smaller than N/2, to avoid aliasing. By this 
we mean the effect caused by sampling at too few (equally spaced) points, so that, for 
instance, in a motion picture, rotating wheels appear as rotating too slowly or even in the 
wrong sense. Hence in applications, N is usually large. But this poses a problem. Eq. (18) 
requires O(N) operations for any particular n, hence O(N as operations for, say, all 
n < N/2. Thus, already for 1000 sample points the straightforward calculation would 
involve millions of operations. However, this difficulty can be overcome by the so-called 
fast Fourier transform (FFT), for which codes are readily available (e.g., in Maple). The 
FFT is a computational method for the DFT that needs only O(N) logg N operations 
instead of O(N”). It makes the DFT a practical tool for large N. Here one chooses N = 2” 
(p integer) and uses the special form of the Fourier matrix to break down the given problem 
into smaller problems. For instance, when N = 1000, those operations are reduced by a 
factor 1000/logz 1000 ~ 100. 

The breakdown produces two problems of size M = N/2. This breakdown is possible 
because for N = 2M we have in (19) 


we = wey = (e727UN)2 = e472 — 9-271 _ yy, 

The given vectorf=[fo -:: Peal” is split into two vectors with M components each, 
namely, fey=[fo f2 °** fy-2|' containing the even components of f, and fog = 
[fit fg °°: fy-1] containing the odd components of f. For fey and fog we determine 
the DFTs 

f—-(f f a 2 T_Roe 

vo | fev,o fev,2 Sev,n-2] —~ PMtev 

and 

a ee z ry T_ 

foa _ Lfod,1 foa,3 ue Soa,n-1] ~~ Fyfoa 


involving the same M X M matrix Fy. From these vectors we obtain the components of 
the DFT of the given vector f by the formulas 


(a) ti = fovin + whfodn n=0,::-,M-—1 


(22) : 7 . 
(b) fram = fevn ~ WNfoan n=0,:--,M—1. 
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For N = 2? this breakdown can be repeated p — 1 times in order to finally arrive at N/2 
problems of size 2 each, so that the number of multiplications is reduced as indicated 
above. 

We show the reduction from N = 4 to M = N/2 = 2 and then prove (22). 


Fast Fourier Transform (FFT). Sample of N = 4 Values 


When N = 4, then w = wy = —ias in Example 4 and M = N/2 = 2, hence w = wy e 27/2 — eT 1. 
Consequently, 
. Ifo 1 1][f) th 
fey = a ha Fofey = airs f 
fa 1 —-I|lh fo — he 
. [A 1 1fA] [A+s 
foa = a Fofoa = = | : 
fs 1 —I]|Lf Aas 
From this and (22a) we obtain 
fo =fevo + whfoao = (fot A + (htf)=f+ hth fs 
fi Sfx + whfoa, (fo — fe) — (A + fs) =fo— fi — fa + ifs. 
Similarly, by (22b), 
fa =fevo — Whfoao = (fo +f) — (At fs) =fo-fth— fs 
b = fea e whfoas (fo — fa) — (-D(A — fs) = fo + fh — fe — tf. 
This agrees with Example 4, as can be seen by replacing 0, 1, 4, 9 with fo, fi, fa, fa. i] 


We prove (22). From (18) and (19) we have for the components of the DFT 
. N-1 ‘ 
In = > WN k: 
k=0 
Splitting into two sums of M = N/2 terms each gives 
7 M-1 M-1 
fa = Dwi fon + DS weet eon. 
k=0 = 


k=0 


We now use wh = wy and pull out wy from under the second sum, obtaining 


M-1 M-1 
c k k 
(23) fn = Swit fev + WN >, WM foa,k- 
k=0 k=0 


The two sums are fey,» and fog, the components of the “half-size” transforms Ff, and 
Ffog. 
Formula (22a) is the same as (23). In (22b) we have n + M instead of n. This causes 
a sign changes in (23), namely —wy before the second sum because 
wM = eT 2TiM/N _ 2-2mi/2 _ 


| 


This gives the minus in (22b) and completes the proof. a 
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PROBLEM SET 1T1-9 


1. Review in complex. Show that 1/i = —i, eo = 12-17} USE OF TABLE III IN SEC. 11.10. 
cosx—isinx, e”+e ~=2cosx, e“~-e “= OTHER METHODS 
BUSES GO COREE i ERM Re 12. Find ¥(f(x)) for f(x) = xe~* if x > 0, f(x) = 0 if 
x <0, by (9) in the text and formula 5 in Table III 
2-11] FOURIER TRANSFORMS BY (with a = 1). Hint. Consider xe~* and e~”. 


INTEGRATION 
Find the Fourier transform of f(x) (without using Table 
Ill in Sec. 11.10). Show details. 


i fe if -l<x<1l 
f(x) = 
0) otherwise 
3. fe) {, if a<x<b 
fx = 
f 0 otherwise 
re a if x<O0 (k>0) 
~fx= 
: 0 if x>0 
5. F(a) { if -a<x<a 
fx = 
0 otherwise 
6. f(x) =e ltl (0 <x < w) 
7. fe) x if O<x<a 
~fa~= 
0 otherwise 
8. F(a) xe” if -1<x<0 
fx = 
, 0 otherwise 
lIx| if -l<x<1 


0) otherwise 


eso * 
| 


-l<x<l 


QO otherwise 


17. 


. Obtain ¥(e~”/) from Table IIT. 

. In Table III obtain formula 7 from formula 8. 

. In Table III obtain formula 1 from formula 2. 

. TEAM PROJECT. Shifting (a) Show that if f(x) 


has a Fourier transform, so does f(x — a), and 
{f(x — a)} =e Ff}. 

(b) Using (a), obtain formula 1 in Table III, Sec. 11.10, 
from formula 2. 

(c) Shifting on the w-Axis. Show that if f(w) is the 
Fourier transform of f(x), then if (w — a) is the Fourier 
transform of e’“*f(x). 

(d) Using (c), obtain formula 7 in Table III from 1 and 
formula 8 from 2. 

What could give you the idea to solve Prob. 11 by using 
the solution of Prob. 9 and formula (9) in the text? 
Would this work? 


18-25 


DISCRETE FOURIER TRANSFORM 


18. 
19. 


20. 


21. 


22. 


23. 


24. 


25. 


Verify the calculations in Example 4 of the text. 
Find the transform of a 
f=th f& B fal’ of four values. 


Find the inverse matrix in Example 4 of the text and 
use it to recover the given signal. 


general signal 


Find the transform (the frequency spectrum) of a 
general signal of two values [f, fal". 

Recreate the given signal in Prob. 21 from the 
frequency spectrum obtained. 

Show that for a signal of eight sample values, 
w=e = - i)/ V2. Check by squaring. 

Write the Fourier matrix F for a sample of eight values 
explicitly. 

CAS Problem. Calculate the inverse of the 8 X 8 
Fourier matrix. Transform a general sample of eight 
values and transform it back to the given data. 
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11.10 Tables of Transforms 


Table I. 


See (2) in Sec. 11.8. 


Fourier Cosine Transforms 


f(x) 


OO = Fey 


{, if O<x<a 


0 otherwise 


2 | x*! O<a<1) 


5 | e™ (a>0) 

6 x"e™ (a>0) 
0 otherwise 
8 cos (ax?) (a > 0) 


9 sin (ax?) (a > 0) 


10 a @ =O) 
a 
ll e sin x 
xX 


12 Jo(ax) (a> 0) 


fon if0<x<a 


2 sin aw 
T ow 


(['(a) see App. A3.1.) 


1 ap 
—e w*/(4a) 


V 2a 
2 n! - \ntl Re = 

aif = @ z wera Re(a + iw) Real part 
1 sina(l—w) — sina(1 + w) 

val l1—w . l+w | 


—(1 —- uw - a)) (See Sec. 6.3.) 
= arctan —> 
V2 w2 


{2 1 
5 5 (1 — u(w — a)) (See Secs. 5.5, 6.3.) 
T Va —w 
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Table Il. 


See (5) in Sec. 11.8. 


Fourier Sine Transforms 
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f@) 
i {' if0<x<a 
OQ otherwise 
2 | I/vx 
3 1/x3/2 
i) a? O29~= th 


6 P (a > 0) 
7 x"e"™ (a>0) 
8 xe? /? 


9 | xe" (a>0) 


sinx if0<x<a 
0 | 
0 otherwise 
ul —— (a > 0) 


2 
12 arctan (a > 0) 


fs(w) = ¥s(f) 


2 1 — cos aw 
T Ww 
1/Vw 
2Vw 
2T°@ . ar 
—— sin —— 
T 2 


2 w 
/— arctan — 
7 a 
2 n! 
Im(a + iw)" 
2 (a2 + w?)"*1 ( ) 


—w?/2 


we 


W —w?/4a 
é 


(['(a) see App. A3.1.) 


(2a)3/2 
iT sina(l — w) © sina(1 + w) 
eal l-—w l+w 


Im = 
Imaginary part 


(See Sec. 6.3.) 
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Table Ill. Fourier Transforms 


See (6) in Sec. 11.9. 


f@) 
i . if-b<x<b 
0 otherwise 
1 ifb<x<c 
2 
OQ otherwise 
3 (a > 0) 


x2 + a? 


x ifO<x<b 
2x-b ifb<x< 2b 


0 otherwise 


| 
4 eee ae 


0 otherwise 


ew ifb<x<c 


QO otherwise 


eb if-—b<x<b 


0) otherwise 


e ifb<x<c 


0 otherwise 


9 ee (a>0) 


10 


(a > 0) 


fw) = Ff) 


2 sin bw 
T ow 


=] + Qeibw _ eo 2ibw 


V2Tw2 


1 
V21r(a + iw) 


e6a—twie _ ela twr 


V 27a — iw) 


2 sin b(w — a) 
T wa 


ebta-w) _ eita-w) 


i 


V2T a-w 


i iflwi <a: Oitlwl >a 


Chapter 11 Review Questions and Problems 


1. What is a Fourier series? A Fourier cosine series? A 
half-range expansion? Answer from memory. 


2. What are the Euler formulas? By what very important 
idea did we obtain them? 


3. How did we proceed from 27r-periodic to general- 
periodic functions? 

4. Can a discontinuous function have a Fourier series? A 
Taylor series? Why are such functions of interest to the 
engineer? 

5. What do you know about convergence of a Fourier 
series? About the Gibbs phenomenon? 

6. The output of an ODE can oscillate several times as 
fast as the input. How come? 

7. What is approximation by trigonometric polynomials? 
What is the minimum square error? 

8. What is a Fourier integral? A Fourier sine integral? 
Give simple examples. 

9. What is the Fourier transform? The discrete Fourier 
transform? 

10. What are Sturm—Liouville problems? By what idea are 
they related to Fourier series? 


FOURIER SERIES. In Probs. 11, 13, 16, 20 find 
the Fourier series of f(x) as given over one period and 
sketch f(x) and partial sums. In Probs. 12, 14, 15, 17-19 
give answers, with reasons. Show your work detail. 


0 if -2<x<0 
11. f(x) = 
2 if O<x<2 
12. Why does the series in Prob. 11 have no cosine terms? 


0 if -l<x<0 


13.0) = { 0<x<l 


x if 
14. What function does the series of the cosine terms in 
Prob. 13 represent? The series of the sine terms? 

15. What function do the series of the cosine terms and the 
series of the sine terms in the Fourier series of 

e* (—5 <x < 5) represent? 


16. f(x) = |x| (-7 <x<7) 
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17. Find a Fourier series from which you can conclude that 
1 — 1/3 + 1/5 — 1/7 4 - = 7/4, 


18. What function and series do you obtain in Prob. 16 by 


(termwise) differentiation? 
19. Find the 

(0<x< 1). 
20. f(x) = 3x2 (-47 <x< 7) 


21-22 | GENERAL SOLUTION 


Solve, y” + wy = r(t), where |w| # 0,1, 2,---, r(f) is 
277-periodic and 

21. r(t) = 30? (-a7 <t< 7) 

22. r(t) = |t| (-7 <t<7) 


half-range expansions of f(x) = x 


23-25 | MINIMUM SQUARE ERROR 


23. Compute the minimum square error for f(x) = x/7 
(-7 <x< 77) and trigonometric polynomials of 
degree N = 1,---,5. 

24. How does the minimum square error change if you 
multiply f(x) by a constant k? 

25. Same task as in Prob. 23, for f(x) = |x|/a 
(-—7 <x <7). Why is E* now much smaller (by a 
factor 100, approximately!)? 


26-30 | FOURIER INTEGRALS AND TRANSFORMS 


Sketch the given function and represent it as indicated. If you 


have a CAS, graph approximate curves obtained by replacing 

co with finite limits; also look for Gibbs phenomena. 

26. f(x) =x + 1if0 <x <1 and O otherwise; by the 
Fourier sine transform 

27. f(x) = xif 0 <x < 1 and 0 otherwise; by the Fourier 
integral 

28. f(x) = kx ifa < x < band 0 otherwise; by the Fourier 
transform 

29. f(x) = xif 1 <x <a and 0 otherwise; by the Fourier 
cosine transform 

30. f(x) = e~?” if x > 0 and 0 otherwise; by the Fourier 
transform 
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SUMMARY-OF-CHAPTER-L1 


Fourier Analysis. Partial Differential Equations (PDEs) 


Fourier series concern periodic functions f(x) of period p = 2L, that is, by 
definition f(x + p) = f(x) for all x and some fixed p > 0; thus, f(x + np) = f(x) 
for any integer n. These series are of the form 


(1) f@=at+> («, cos x + b,, sin ) (Sec. 11.2) 
n=1 L L 


with coefficients, called the Fourier coefficients of f(x), given by the Euler formulas 
(Sec. 11.2) 


iB L 
1 1 nix 
do = OL | seo dx, An = L | 0 cos 7 
2) : 
1 . NTX 
by = L | F(@) sin ra dx 


-L 


where n = 1, 2,---. For period 277 we simply have (Sec. 11.1) 


(1*) F(x) = dg + ‘ (a, cos nx + by, sin nx) 


n=1 
with the Fourier coefficients of f(x) (Sec. 11.1) 


1 


w= | f(x) dx, m=+| f(x) cos nx dx, m=+| f(x) sin nx dx. 


=a 


Fourier series are fundamental in connection with periodic phenomena, particularly 
in models involving differential equations (Sec. 11.3, Chap, 12). If f(x) is even 
[f(—x) = f(x)] or odd [f(—x) = —f()], they reduce to Fourier cosine or Fourier 
sine series, respectively (Sec. 11.2). If f(x) is given for 0 S x S L only, it has two 
half-range expansions of period 2L, namely, a cosine and a sine series (Sec. 11.2). 

The set of cosine and sine functions in (1) is called the trigonometric system. 
Its most basic property is its orthogonality on an interval of length 2L; that is, for 
all integers m and n # m we have 


L i 
mTTX nTTX . MTX , nix 
| cos cos L dx = 0, | sin sin dx = 0 


L 
-L -L 


and for all integers m and n, 


& 
MTX . nITXx 
cos sin dx = 0. 
ae: a 


This orthogonality was crucial in deriving the Euler formulas (2). 


Summary of Chapter 11 
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Partial sums of Fourier series minimize the square error (Sec. 11.4). 

Replacing the trigonometric system in (1) by other orthogonal systems first leads 
to Sturm-Liouville problems (Sec. 11.5), which are boundary value problems for 
ODEs. These problems are eigenvalue problems and as such involve a parameter 
A that is often related to frequencies and energies. The solutions to Sturm—Liouville 
problems are called eigenfunctions. Similar considerations lead to other orthogonal 
series such as Fourier-Legendre series and Fourier—Bessel series classified as 
generalized Fourier series (Sec. 11.6). 

Ideas and techniques of Fourier series extend to nonperiodic functions f(x) defined 
on the entire real line; this leads to the Fourier integral 


(3) f(x) = | [A(w) cos wx + B(w) sin wx] dw (Sec. 11.7) 
0 
where 


oo 


(4) A(w) = +| f(v) cos wu dv, Bw) = =| Fv) sin wu dv 


—2 —0 


or, in complex form (Sec. 11.9), 


(5) fix) = a J Fone ii (= V=1) 
where 
(6) fw) = oe [ fovee de. 


Formula (6) transforms f(x) into its Fourier transform f (w), and (5) is the inverse 
transform. 
Related to this are the Fourier cosine transform (Sec. 11.8) 


(7) a (w) = 2I (x) cos wx dx 
0 


and the Fourier sine transform (Sec. 11.8) 


oa) 


(8) fw) = 3 f(x) sin wx dx. 


0 


The discrete Fourier transform (DFT) and a practical method of computing it, 
called the fast Fourier transform (FFT), are discussed in Sec. 11.9. 


CHAPTER ] 2 


Partial Differential 
Equations (PDEs) 


A PDE is an equation that contains one or more partial derivatives of an unknown function 
that depends on at least two variables. Usually one of these deals with time ¢ and the 
remaining with space (spatial variable(s)). The most important PDEs are the wave 
equations that can model the vibrating string (Secs. 12.2, 12.3, 12.4, 12.12) and the 
vibrating membrane (Secs. 12.8, 12.9, 12.10), the heat equation for temperature in a bar 
or wire (Secs. 12.5, 12.6), and the Laplace equation for electrostatic potentials (Secs. 
12.6, 12.10, 12.11). PDEs are very important in dynamics, elasticity, heat transfer, 
electromagnetic theory, and quantum mechanics. They have a much wider range of 
applications than ODEs, which can model only the simplest physical systems. Thus PDEs 
are subjects of many ongoing research and development projects. 

Realizing that modeling with PDEs is more involved than modeling with ODEs, we 
take a gradual, well-planned approach to modeling with PDEs. To do this we carefully 
derive the PDE that models the phenomena, such as the one-dimensional wave equation 
for a vibrating elastic string (say a violin string) in Sec. 12.2, and then solve the PDE 
in a separate section, that is, Sec. 12.3. In a similar vein, we derive the heat equation in 
Sec. 12.5 and then solve and generalize it in Sec. 12.6. 

We derive these PDEs from physics and consider methods for solving initial and 
boundary value problems, that is, methods of obtaining solutions which satisfy the 
conditions required by the physical situations. In Secs. 12.7 and 12.12 we show how PDEs 
can also be solved by Fourier and Laplace transform methods. 


COMMENT. Numerics for PDEs is explained in Secs. 21.4—21.7, which, for greater 
teaching flexibility, is designed to be independent of the other sections on numerics in 
Part E. 


Prerequisites: Linear ODEs (Chap. 2), Fourier series (Chap. 11). 
Sections that may be omitted in a shorter course: 12.7, 12.10-12.12. 
References and Answers to Problems: App. 1 Part C, App. 2. 


12.1 Basic Concepts of PDEs 
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A partial differential equation (PDE) is an equation involving one or more partial 
derivatives of an (unknown) function, call it u, that depends on two or more variables, 
often time ¢ and one or several variables in space. The order of the highest derivative is 
called the order of the PDE. Just as was the case for ODEs, second-order PDEs will be 
the most important ones in applications. 
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EXAMPLE 1 


THEOREM -1 


Just as for ordinary differential equations (ODEs) we say that a PDE is linear if it is 
of the first degree in the unknown function u and its partial derivatives. Otherwise we 
call it nonlinear. Thus, all the equations in Example | are linear. We call a linear PDE 
homogeneous if each of its terms contains either uw or one of its partial derivatives. 
Otherwise we call the equation nonhomogeneous. Thus, (4) in Example | (with f not 
identically zero) is nonhomogeneous, whereas the other equations are homogeneous. 


Important Second-Order PDEs 


Oru 2 WU . : . 
(1) — = Cs One-dimensional wave equation 

at ax? 

ou 2 a7u ; . : 
(2) — =O ses One-dimensional heat equation 

or ox 

au au : . 
(3) —. = 10 Two-dimensional Laplace equation 

ax ay” 

au a7u : : : : : 
(4) —> = Fy) Two-dimensional Poisson equation 

ax ay” 

Onu 2 au au . . : 
(5) c + Two-dimensional wave equation 

ar? ax? ay? 

au au au . . A 
(6) — +— +— =0 Three-dimensional Laplace equation 

axz ay? az 


Here c is a positive constant, f is time, x, y, z are Cartesian coordinates, and dimension is the number of these 
coordinates in the equation. B 


A solution of a PDE in some region R of the space of the independent variables is a 
function that has all the partial derivatives appearing in the PDE in some domain D 
(definition in Sec. 9.6) containing R, and satisfies the PDE everywhere in R. 

Often one merely requires that the function is continuous on the boundary of R, has 
those derivatives in the interior of R, and satisfies the PDE in the interior of R. Letting R 
lie in D simplifies the situation regarding derivatives on the boundary of R, which is then 
the same on the boundary as it is in the interior of R. 

In general, the totality of solutions of a PDE is very large. For example, the functions 


(7) uw=x?- ae u =e" cosy, u = sinx cosh y, u = In(x? + y’) 


which are entirely different from each other, are solutions of (3), as you may verify. We 
shall see later that the unique solution of a PDE corresponding to a given physical problem 
will be obtained by the use of additional conditions arising from the problem. For instance, 
this may be the condition that the solution u assume given values on the boundary of the 
region R (“boundary conditions”). Or, when time fis one of the variables, u (or uz, = du/dt 
or both) may be prescribed at t = 0 (“initial conditions’). 

We know that if an ODE is linear and homogeneous, then from known solutions we 
can obtain further solutions by superposition. For PDEs the situation is quite similar: 


Fundamental Theorem on Superposition 


If uy and ug are solutions of a homogeneous linear PDE in some region R, then 
U = CyUy + Colla 


with any constants cy and Cg is also a solution of that PDE in the region R. 


542 


EXAMPLE 2 
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The simple proof of this important theorem is quite similar to that of Theorem | in Sec. 2.1 
and is left to the student. 

Verification of solutions in Probs. 2-13 proceeds as for ODEs. Problems 16—23 concern 
PDEs solvable like ODEs. To help the student with them, we consider two typical examples. 
Solving u,, — u = 0 Like an ODE 
Find solutions u of the PDE uy. — u = 0 depending on x and y. 


Solution. Since no y-derivatives occur, we can solve this PDE like u" — u = 0. In Sec. 2.2 we would have 
obtained u = Ae” + Be~* with constant A and B. Here A and B may be functions of y, so that the answer is 


u(x, y) = A(y)e* + BOye™ 
with arbitrary functions A and B. We thus have a great variety of solutions. Check the result by differentiation. I 
Solving u,, = —u, Like an ODE 


Find solutions u = u(x, y) of this PDE. 


Solution. Setting u, = p, we have Py=—P, Py/p=—\, In |p| = —y + ci), p = c(xe7® and by 
integration with respect to x, 


u(x, y) = fixe” + g(y) where Sf) = | c(x) dx, 


here, f(x) and g(y) are arbitrary. B 


PROBLEEM—SET 12-1 


1. Fundamental theorem. Prove it for second-order 13. u = x/(x? + y?), y/(x? + y?) 
PDEs in two and three independent variables. Hint. 14. TEAM PROJECT. Verification of Solutions 


Prove it by substitution. 


2-13| VERIFICATION OF SOLUTIONS 


(a) Wave equation. Verify that u(x, tf) = v(x + ct) + 
w(x — ct) with any twice differentiable functions v and 


Verifiy (by substitution) that the given function is a solution w satisfies (1). 
of the PDE. Sketch or graph the solution as a surface in space. (b) Poisson equation. Verify that each u satisfies (4) 
with f(x, y) as indicated. 
2-5| Wave Equation (1) with suitable c s 
er u = y/x f = 2y/x 
ee a u = sin xy f=? + y?) sin xy 
3. u = cos 4f sin 2x peer oy f=402 + yer -# 
4. u = sin kct cos kx ao /Vx2 + y? f= (x2 a y?) 3/2 
ky ee (c) Laplace equation. Verify that 
6-9 Heat Equation (2) with suitable c | [Vx2+y24+ 2 satisfies (6) and 
6. u=e u = In (x? + y?) satisfies (3). Is u = 1/ V x2 + y? a 
7 ye cos wx solution of (3)? Of what Poisson equation? 
8 n=< (d) Verify that u with any (sufficiently often differ- 
rene entiable) uv and w satisfies the given PDE. 
. u = v(x) + w(y) Uxy = O 
10-13 | Laplace Equation (3) “= vw) Ugy = Ugly 
10. uw = e* cos y, e* sin y u = v(x + 21) + w(x - 20) Ure = 4 tbsp 
11. w = arctan (y/x) 15. Boundary value problem. Verify that the function 
12. u = cos y sinh x, sin y cosh x u(x, y) =aln (x? + y?) + bsatisfies Laplace’s equation 
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(3) and determine a and Db so that w satisfies the 
boundary conditions u=110 on the circle 
x2 + y? = | and u = 0 on the circle x7 + y? = 100. 


16-23 | PDEs SOLVABLE AS ODEs 


This happens if a PDE involves derivatives with respect to 
one variable only (or can be transformed to such a form), 
so that the other variable(s) can be treated as parameter(s). 
Solve for u = u(x, y): 


16. Uy, = 0 


17. yy + 16777u = 0 
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18. 25uy, — 4u = 0 19. uy + y7u = 0 
20. 2uxy + Quy, + 4u = —3 cos x 
21. Uy + 6uy + 13u = 4e3Y 

23. x" Uae + 2xu, — 2u = 0 
24. Surface of revolution. Show that the solutions z = 
z(x, y) of yZ_ = xZy represent surfaces of revolution. Give 


examples. Hint. Use polar coordinates r, 8 and show that 
the equation becomes z, = 0. 


25. System of PDEs. Solve uz, = 0, uyy = 0 


29 sin x 


22. Uzy = Ux 


12.2 Modeling: Vibrating String, Wave Equation 


In this section we model a vibrating string, which will lead to our first important PDE, 
that is, equation (3) which will then be solved in Sec. 12.3. The student should pay very 
close attention to this delicate modeling process and detailed derivation starting from 
scratch, as the skills learned can be applied to modeling other phenomena in general and 
in particular to modeling a vibrating membrane (Sec. 12.7). 

We want to derive the PDE modeling small transverse vibrations of an elastic string, such 
as a violin string. We place the string along the x-axis, stretch it to length L, and fasten it 
at the ends x = 0 and x = L. We then distort the string, and at some instant, call it t = 0, 
we release it and allow it to vibrate. The problem is to determine the vibrations of the string, 
that is, to find its deflection u(x, ft) at any point x and at any time t > 0; see Fig. 286. 

u(x, t) will be the solution of a PDE that is the model of our physical system to be 
derived. This PDE should not be too complicated, so that we can solve it. Reasonable 
simplifying assumptions (just as for ODEs modeling vibrations in Chap. 2) are as follows. 


Physical Assumptions 


1. The mass of the string per unit length is constant (“homogeneous string”). The string 
is perfectly elastic and does not offer any resistance to bending. 


2. The tension caused by stretching the string before fastening it at the ends is so large 
that the action of the gravitational force on the string (trying to pull the string down 


a little) can be neglected. 


3. The string performs small transverse motions in a vertical plane; that is, every 
particle of the string moves strictly vertically and so that the deflection and the slope 
at every point of the string always remain small in absolute value. 


Under these assumptions we may expect solutions u(x, f) that describe the physical 


reality sufficiently well. 


| 
| 
| 
| 
| 
! 
x 


Fig. 286. Deflected string at fixed time t. Explanation on p. 544 
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Derivation of the PDE of the Model 
(“Wave Equation”) from Forces 


The model of the vibrating string will consist of a PDE (“wave equation’) and additional 
conditions. To obtain the PDE, we consider the forces acting on a small portion of the 
string (Fig. 286). This method is typical of modeling in mechanics and elsewhere. 

Since the string offers no resistance to bending, the tension is tangential to the curve 
of the string at each point. Let 7 and 75 be the tension at the endpoints P and Q of that 
portion. Since the points of the string move vertically, there is no motion in the horizontal 
direction. Hence the horizontal components of the tension must be constant. Using the 
notation shown in Fig. 286, we thus obtain 


(1) T, cos a = Th cos B = T = const. 


In the vertical direction we have two forces, namely, the vertical components — 7, sin a 
and 7 sin B of 7, and 7; here the minus sign appears because the component at P is 
directed downward. By Newton’s second law (Sec. 2.4) the resultant of these two forces 
is equal to the mass pAx of the portion times the acceleration a7u/ at”, evaluated at some 
point between x and x + Ax; here p is the mass of the undeflected string per unit length, 
and Ax is the length of the portion of the undeflected string. (A is generally used to denote 
small quantities; this has nothing to do with the Laplacian V7, which is sometimes also 
denoted by A.) Hence 


a2 
To sin B — 7 sina = pax". 
t 


Using (1), we can divide this by 73 cos B = TF, cos a = T, obtaining 


Ty sin T, sin Ax du 
aie yeme tan B tana = 2 * Sy 


Ty cos B Ty cos @ T ot 


(2) 


Qs 


Now tan a and tan B are the slopes of the string at x and x + Ax: 


) ) 
tana = (2) and tan B = (24) 
Ox / |» Ox 


Here we have to write partial derivatives because u also depends on time t. Dividing (2) 


by Ax, we thus have 
ul)... Ge) 
Ax|\O%/ laa Ag ox 


If we let Ax approach zero, we obtain the linear PDE 


x+Ax 


= a7u 
2: 
~| Tat 


2 2 
0 0 L 
(3) eS ea-. 

at ax pP 


This is called the one-dimensional wave equation. We see that it is homogeneous and 
of the second order. The physical constant T/p is denoted by c? (instead of c) to indicate 
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that this constant is positive, a fact that will be essential to the form of the solutions. “One- 
dimensional” means that the equation involves only one space variable, x. In the next 
section we shall complete setting up the model and then show how to solve it by a general 
method that is probably the most important one for PDEs in engineering mathematics. 


12.3. Solution by Separating Variables. 
Use of Fourier Series 


We continue our work from Sec. 12.2, where we modeled a vibrating string and obtained 
the one-dimensional wave equation. We now have to complete the model by adding 
additional conditions and then solving the resulting model. 

The model of a vibrating elastic string (a violin string, for instance) consists of the one- 
dimensional wave equation 


T 
a 
Is 
D 
T 


2 
T 
(1) Onu 2 OU 2 E 


for the unknown deflection u(x, f) of the string, a PDE that we have just obtained, and 
some additional conditions, which we shall now derive. 

Since the string is fastened at the ends x = 0 and x = L (see Sec. 12.2), we have the 
two boundary conditions 


(2) (a) uO, = 0, (b) u(Z, t) = 0, for all t = 0. 


Furthermore, the form of the motion of the string will depend on its initial deflection 
(deflection at time ¢ = 0), call it f(x), and on its initial velocity (velocity at t = 0), call it 
g(x). We thus have the two initial conditions 


(3) (a) u@,0)=f@), (0) 4@0=g@) OSxSD) 


where uz; = du/dt. We now have to find a solution of the PDE (1) satisfying the conditions 
(2) and (3). This will be the solution of our problem. We shall do this in three steps, as 
follows. 

Step 1. By the “method of separating variables” or product method, setting 
u(x, t) = F(x)G(t), we obtain from (1) two ODEs, one for F(x) and the other one 
for G(?). 

Step 2. We determine solutions of these ODEs that satisfy the boundary conditions (2). 
Step 3. Finally, using Fourier series, we compose the solutions found in Step 2 to obtain 
a solution of (1) satisfying both (2) and (3), that is, the solution of our model of the 
vibrating string. 


Step 1. Two ODEs from the Wave Equation (1) 


In the method of separating variables, or product method, we determine solutions of the 
wave equation (1) of the form 


(4) u(x, t) = F(x)G(0) 
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which are a product of two functions, each depending on only one of the variables x and ¢. 
This is a powerful general method that has various applications in engineering mathematics, 
as we shall see in this chapter. Differentiating (4), we obtain 


where dots denote derivatives with respect to t and primes derivatives with respect to x. 
By inserting this into the wave equation (1) we have 


FG = c°F"G. 
Dividing by c?FG and simplifying gives 


G F" 


eG F° 
The variables are now separated, the left side depending only on ¢ and the right side only 
on x. Hence both sides must be constant because, if they were variable, then changing f 
or x would affect only one side, leaving the other unaltered. Thus, say, 


GF" 


CG F 


Multiplying by the denominators gives immediately two ordinary DEs 


(5) F" —kF=0 
and 
(6) G—ciKe —0. 


Here, the separation constant k is still arbitrary. 


Step 2. Satisfying the Boundary Conditions (2) 


We now determine solutions F and G of (5) and (6) so that u = FG satisfies the boundary 
conditions (2), that is, 


(7) u(0, t) = F(O)G(A) = 0, u(L, t) = F(L)G(t) = 0 for all ¢. 


We first solve (5). If G = 0, then u = FG = 0, which is of no interest. Hence G # 0 
and then by (7), 


(8) (a) FO)=0, (b) FUL) =0. 


We show that k must be negative. For k = 0 the general solution of (5) is F = ax + b, 
and from (8) we obtain a = b = 0, so that F = 0 and u = FG = 0, which is of no interest. 
For positive k = jie a general solution of (5) is 


F = Ae“” + Be” 
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and from (8) we obtain F = 0 as before (verify!). Hence we are left with the possibility 
of choosing k negative, say, k = —p”. Then (5) becomes F” + p'F = 0 and has as a 
general solution 


F(x) = A cos px + B sin px. 


From this and (8) we have 
F(O)=A=0 and then F(L) = BsinpL = 0. 


We must take B # 0 since otherwise F = 0. Hence sin pL = 0. Thus 


ni 


(9) pL=nqT7, so that p= 


(n integer). 


Setting B = 1, we thus obtain infinitely many solutions F(x) = Fy, (x), where 
(10) Fy (x) = sin a (n= 1; 2,*+*). 


These solutions satisfy (8). [For negative integer n we obtain essentially the same solutions, 
except for a minus sign, because sin (—a) = —sina.] 
We now solve (6) with k = —p” = —(nt/L)* resulting from (9), that is, 


Cntr 


(11*) G+A2G=0 — where An = cp = 


A general solution is 
Gy(t) = By, cos Ant + BX sin Apt. 


Hence solutions of (1) satisfying (2) are uy(x, 1) = FrQ)Gy(t) = Gr(OF n(x), written out 
‘ Ls 
(11) Uy (x, t) = (By cos Apt + BF sin A,f) sin Tt (n = 1,2,-°-). 


These functions are called the eigenfunctions, or characteristic functions, and the values 
Ay = cnt/L are called the eigenvalues, or characteristic values, of the vibrating string. 
The set {Ay, Ao, --: } is called the spectrum. 


Discussion of Eigenfunctions. We see that each u,, represents a harmonic motion having 
the frequency \,,/277 = cn/2L cycles per unit time. This motion is called the nth normal 
mode of the string. The first normal mode is known as the fundamental mode (n = 1), 
and the others are known as overtones; musically they give the octave, octave plus fifth, 
etc. Since in (11) 


sin = 0 at x= nae Ls 


the nth normal mode has n — | nodes, that is, points of the string that do not move (in 
addition to the fixed endpoints); see Fig. 287. 


548 


CHAP. 12 Partial Differential Equations (PDEs) 


2 &. aoe Cae 
nel n=2 n=3 n=4 


Fig. 287. Normal modes of the vibrating string 


Figure 288 shows the second normal mode for various values of ¢. At any instant the 
string has the form of a sine wave. When the left part of the string is moving down, the 
other half is moving up, and conversely. For the other modes the situation is similar. 


Tuning is done by changing the tension T. Our formula for the frequency A,,/277 = cn/2L 
of u, with c = VT/p [see (3), Sec. 12.2] confirms that effect because it shows that the 
frequency is proportional to the tension. T cannot be increased indefinitely, but can you 
see what to do to get a string with a high fundamental mode? (Think of both L and p.) 
Why is a violin smaller than a double-bass? 


Fig. 288. Second normal mode for various values of t 


Step 3. Solution of the Entire Problem. Fourier Series 


The eigenfunctions (11) satisfy the wave equation (1) and the boundary conditions (2) 
(string fixed at the ends). A single u,, will generally not satisfy the initial conditions (3). 
But since the wave equation (1) is linear and homogeneous, it follows from Fundamental 
Theorem | in Sec. 12.1 that the sum of finitely many solutions uw, is a solution of (1). To 
obtain a solution that also satisfies the initial conditions (3), we consider the infinite series 
(with A, = cn7r/L as before) 


(12) u(x.) = Sune, 0 = > (By cos Ant + Be sin Apf) sin ee 


n=1 n=1 


Satisfying Initial Condition (3a) (Given Initial Displacement). From (12) and (3a) 
we obtain 


(13) u(x,0) = SB, sin ie = f(x). C= 221), 
n=1 


Hence we must choose the B,,’s so that u(x, 0) becomes the Fourier sine series of f(x). 
Thus, by (4) in Sec. 11.3, 


L 
(14) Bn = | f@) sin dr, p29 o% 
0 


een) 
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Satisfying Initial Condition (3b) (Given Initial Velocity). Similarly, by differentiating 
(12) with respect to ¢ and using (3b), we obtain 


Ou 
ot 


= TT: 
- | S\ (—ByAn sin Ant + B#Ay COS Ans) sin a 
t=0 n=1 t=0 


= > BEA, sin 7 = g(x). 
n=1 


Hence we must choose the B;*’s so that for t = 0 the derivative du/dt becomes the Fourier 
sine series of g(x). Thus, again by (4) in Sec. 11.3, 


L 
BEA, = | g(x) sin de. 


0 


en) 


Since A, = cn7r/L, we obtain by division 


L 
2; . NIX 
(15) | g(x) sin | dx, n= 1,2,:°: 
0 


Result. Our discussion shows that u(x, f) given by (12) with coefficients (14) and (15) 
is a solution of (1) that satisfies all the conditions in (2) and (3), provided the series (12) 
converges and so do the series obtained by differentiating (12) twice termwise with respect 
to x and f and have the sums a7u/ ax” and a°u/ at”, respectively, which are continuous. 


Solution (12) Established. According to our derivation, the solution (12) is at first a 
purely formal expression, but we shall now establish it. For the sake of simplicity we 
consider only the case when the initial velocity g(x) is identically zero. Then the Bj, are 
zero, and (12) reduces to 


(16) u(x, t) = Sz, cos Apt sin a An = 


n=1 


It is possible to sum this series, that is, to write the result in a closed or finite form. For 
this purpose we use the formula [see (11), App. A3.1] 


cn. ni 1| . ni . nT 
cos t sin x= sin (x — ct)? + sin Tet ct)? |. 


L 


Consequently, we may write (16) in the form 
1< 1S 
u(x,t) = ~ SB, sin {Zo = an} +— SB, sin {Zo 4 oh, 
el - 20 L 


These two series are those obtained by substituting x — ct and x + ct, respectively, for 
the variable x in the Fourier sine series (13) for f(x). Thus 


(17) u(x, t) = 3[ f*(x — ct) + f*(x + ct)] 
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where f* is the odd periodic extension of f with the period 2L (Fig. 289). Since the initial 
deflection f(x) is continuous on the interval 0 = x S Land zero at the endpoints, it follows 
from (17) that u(x, ft) is a continuous function of both variables x and ¢ for all values of 
the variables. By differentiating (17) we see that u(x, f) is a solution of (1), provided f(x) 
is twice differentiable on the interval 0 < x < L, and has one-sided second derivatives at 
x = Oandx = L, which are zero. Under these conditions u (x, f) is established as a solution 
of (1), satisfying (2) and (3) with g(x) = 0. B 


1 - 


Fig. 289. Odd periodic extension of f(x) 


Generalized Solution. If f(x) and f"(x) are merely piecewise continuous (see Sec. 6.1), 
or if those one-sided derivatives are not zero, then for each ¢ there will be finitely many 
values of x at which the second derivatives of u appearing in (1) do not exist. Except at 
these points the wave equation will still be satisfied. We may then regard u(x, f) as a 
“generalized solution,” as it is called, that is, as a solution in a broader sense. For instance, 
a triangular initial deflection as in Example | (below) leads to a generalized solution. 


Physical Interpretation of the Solution (17). The graph of f* (x — cf) is obtained from 
the graph of f* (x) by shifting the latter ct units to the right (Fig. 290). This means that 
f* (x — ct)(c > 0) represents a wave that is traveling to the right as ¢ increases. Similarly, 


f*( + ct) represents a wave that is traveling to the left, and w(x, f) is the superposition 
of these two waves. 


f*(x) f*(x - ct) 


we A , 


Fig. 290. Interpretation of (17) 


Vibrating String if the Initial Deflection Is Triangular 


Find the solution of the wave equation (1) satisfying (2) and corresponding to the triangular initial deflection 


2k G 
ae if Se 


f(x) = 
2k -_ L 
—(L = x) if =x = 
L 2 
and initial velocity zero. (Figure 291 shows f(x) = u(x, 0) at the top.) 


Solution. Since g(x) = 0, we have B* = 0 in (12), and from Example 4 in Sec. 11.3 we see that the B, are 
given by (5), Sec. 11.3. Thus (12) takes the form 


| TC l , 397 37Tc 
z me ee ; t sin ae : t+ 


u(x, ft) 
= 
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For graphing the solution we may use u(x, 0) = f(x) and the above interpretation of the two functions in the 
representation (17). This leads to the graph shown in Fig. 291. | 


< 


ae 1 pe) 


Fig. 291. 
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al fs 
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Solution u(x, t) in Example 1 for various values of t (right part 


of the figure) obtained as the superposition of a wave traveling to the 
right (dashed) and a wave traveling to the left (left part of the figure) 


PROBLEM SET 12-3 


1. Frequency. How does the frequency of the fundamental 
mode of the vibrating string depend on the length of the 
string? On the mass per unit length? What happens if 
we double the tension? Why is a contrabass larger than 
a violin? 


2. Physical Assumptions. How would the motion of 
the string change if Assumption 3 were violated? 
Assumption 2? The second part of Assumption 1? The 
first part? Do we really need all these assumptions? 


3. String of length 77. Write down the derivation in this 
section for length L = 77, to see the very substantial 
simplification of formulas in this case that may show 
ideas more clearly. 


4. CAS PROJECT. Graphing Normal Modes. Write a 
program for graphing u,, with L = 7 and c? of your 
choice similarly as in Fig. 287. Apply the program to 
Ug, U3, Ug. Also graph these solutions as surfaces over 
the xt-plane. Explain the connection between these two 
kinds of graphs. 


DEFLECTION OF THE STRING 


Find u(x, t) for the string of length L = 1 and c2 = 1 when 
the initial velocity is zero and the initial deflection with small 
k (say, 0.01) is as follows. Sketch or graph u(x, f) as in 
Fig. 291 in the text. 

5. k sin 37rx 


6. k (sin 77x — 3 sin 277x) 
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7. kx(1 — x) 8. kx2(1 — x) 
9. 
0.1 
1 J 
0.5 1 
10. 


11. 


12. 1 


a 


1 


3 

a 

13. 2x — 4x7 if0<x<%, 0 if§<x<1 

14. Nonzero initial velocity. Find the deflection u(x, t) of 
the string of length L = 7 and c? = | for zero initial dis- 
placement and “triangular” initial velocity u,(x, 0) = 0.01x 
if OSx S37, u(x,0)=00l(7 —x) if 57S 
x S77. (Initial conditions with u,(x, 0) # O are hard 
to realize experimentally.) 


u y 


Fig. 292. Elastic beam 


15-20 


SEPARATION OF A FOURTH-ORDER 
PDE. VIBRATING BEAM 
By the principles used in modeling the string it can be 


shown that small free vertical vibrations of a uniform elastic 
beam (Fig. 292) are modeled by the fourth-order PDE 


(21) —+>=-c na (Ref. [C11]) 


where c? = EI/pA (E = Young’s modulus of elasticity, 
IZ = moment of intertia of the cross section with respect to the 


y-axis in the figure, p = density, A = cross-sectional 
area). (Bending of a beam under a load is discussed in 
Sec. 3.3.) 


15. Substituting u = F(x)G(f) into (21), show that 
F®/F = —G/c? G = p* = const, 
F(x) = Acos Bx + B sin Bx 
+ Ccosh Bx + D sinh Bx, 
G(t) = acos cB t + bsin cB" t. 


x 
a (A) Simply supported 


x=0 x=L 


(B) Clamped at both 
ends 


(C) Clamped at the left 
end, free at the 
right end 


Fig. 293. Supports of a beam 


16. Simply supported beam in Fig. 293A. Find solutions 
Un = Fy(x)G,(t) of (21) corresponding to zero initial 
velocity and satisfying the boundary conditions (see 
Fig. 293A) 


u(0, t) = 0, u(L, t) = 0 
(ends simply supported for all times 1), 
Ue (O, t) = 0, Uae (L, 1) = 0 
(zero moments, hence zero curvature, at the ends). 


17. Find the solution of (21) that satisfies the conditions in 
Prob. 16 as well as the initial condition 


u(x, 0) = f@) = x(L — x). 


18. Compare the results of Probs. 17 and 7. What is the 
basic difference between the frequencies of the normal 
modes of the vibrating string and the vibrating beam? 

19. Clamped beam in Fig. 293B. What are the boundary 
conditions for the clamped beam in Fig. 293B? Show 
that F in Prob. 15 satisfies these conditions if BL is a 
solution of the equation 


(22) cosh BL cos BL = 1. 


Determine approximate solutions of (22), for instance, 
graphically from the intersections of the curves of 
cos BL and 1/cosh BL. 
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20. Clamped-free beam in Fig. 293C. If the beam is Show that F in Prob. 15 satisfies these conditions if BL 
clamped at the left and free at the right (Fig. 293C), is a solution of the equation 
the boundary conditions are 
OH =O r=. (23) cosh BL cos BL = —1. 
Uy (L, t) = 0, Uxog(L, f) = 0. Find approximate solutions of (23). 


12.4 D’Alembert’s Solution 
of the Wave Equation. Characteristics 


It is interesting that the solution (17), Sec. 12.3, of the wave equation 


() 


Pu _ 2 du eee 
ara ax? P 


can be immediately obtained by transforming (1) in a suitable way, namely, by introducing 
the new independent variables 


(2) v=x+ct, Ww =x -— ct. 


Then u becomes a function of v and w. The derivatives in (1) can now be expressed in terms 
of derivatives with respect to v and w by the use of the chain rule in Sec. 9.6. Denoting 
partial derivatives by subscripts, we see from (2) that v, = 1 and wy, = 1. For simplicity 
let us denote u(x, f), as a function of v and w, by the same letter vu. Then 
Uy = UyVy + UpWy = Uy + Uy. 
We now apply the chain rule to the right side of this equation. We assume that all the 
partial derivatives involved are continuous, so that up» = Upw. Since Vv, = | andw, = 1, 
we obtain 
Ue = Uy + Up)e = Uy + Uyp)yVe + Uy + Uw) wWe = Uyy + 2Upw + Uww- 

Transforming the other derivative in (1) by the same procedure, we find 


_ 2 
Ute = Co (Uyy — QWyw + Uw). 


By inserting these two results in (1) we get (see footnote 2 in App. A3.2) 


a7u 
3 = = 
(3) Uow apn 


The point of the present method is that (3) can be readily solved by two successive 
integrations, first with respect to w and then with respect to v. This gives 


0 


“= hiv) and “= | neorae + Ww). 
dv 
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Here h(v) and &(w) are arbitrary functions of v and w, respectively. Since the integral is 
a function of v, say, @(v), the solution is of the form u = @(v) + W(w). In terms of x 
and ft, by (2), we thus have 


(4) Ce 1) = GHGS se Gi) se Uae — en), 


This is known as d’Alembert’s solution’ of the wave equation (1). 
Its derivation was much more elegant than the method in Sec. 12.3, but d’ Alembert’s method 
is special, whereas the use of Fourier series applies to various equations, as we shall see. 


D’Alembert’s Solution Satisfying the Initial Conditions 

(5) (a) u(x,0) =f), = (b)- uz (x, 0) = gQ). 
These are the same as (3) in Sec. 12.3. By differentiating (4) we have 
(6) uz(x, t) = cb'(x + ct) — c'(x — ct) 


where primes denote derivatives with respect to the entire arguments x + ct and x — ct, 
respectively, and the minus sign comes from the chain rule. From (4)-(6) we have 


(7) u(x, 0) = f(x) + w(x) = fo), 
(8) uz (x, 0) = cb'(x) + c"(x) = g(x). 


Dividing (8) by c and integrating with respect to x, we obtain 


1 x 
(9) f(x) — WO) = k(xo) + Al 8(s) ds, k(xo) = &(x0) — Wo). 


Xo 


If we add this to (7), then & drops out and division by 2 gives 
1 in 1 
(10) (x) = > f@) +5] gts) ds + = k(x). 
2 2c y 2 
Similarly, subtraction of (9) from (7) and division by 2 gives 
ap vo = Spey - 2 | (8) ds — + k(x) 
2 7 ai Sadi 


In (10) we replace x by x + ct; we then get an integral from x9 to x + ct. In (11) we 
replace x by x — ct and get minus an integral from x9 to x — ct or plus an integral from 
x — ct to xo. Hence addition of @(x + ct) and w(x — cf) gives u(x, f) [see (4)] in the form 


(are 


1 
(12) u(x, t) = al fe =P (2) ar jee = @A)l| ar + | g(s)ds. 
eG 


1JEAN LE ROND D’ ALEMBERT (1717-1783), French mathematician, also known for his important work 
in mechanics. 
We mention that the general theory of PDEs provides a systematic way for finding the transformation (2) 
that simplifies (1). See Ref. [C8] in App. 1. 
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If the initial velocity is zero, we see that this reduces to 
(13) u(x,t) = Lf + ct) + fx — ed), 


in agreement with (17) in Sec. 12.3. You may show that because of the boundary conditions 
(2) in that section the function f must be odd and must have the period 2L. 

Our result shows that the two initial conditions [the functions f(x) and g(x) in (5)] 
determine the solution uniquely. 

The solution of the wave equation by the Laplace transform method will be shown in 
Sec. 12.11. 


Characteristics. Types and Normal Forms of PDEs 


The idea of d’ Alembert’s solution is just a special instance of the method of characteristics. 
This concerns PDEs of the form 


(14) Aller tp 2BU py te Cura, > Ey nt, Us. Uy) 


(as well as PDEs in more than two variables). Equation (14) is called quasilinear because 
it is linear in the highest derivatives (but may be arbitrary otherwise). There are three 
types of PDEs (14), depending on the discriminant AC — B?. as follows. 


Type Defining Condition Example in Sec. 12.1 
Hyperbolic AC — B? <0 Wave equation (1) 
Parabolic AC — B*=0 Heat equation (2) 
Elliptic AC — B? >0 Laplace equation (3) 


Note that (1) and (2) in Sec. 12.1 involve ¢, but to have y as in (14), we set y = ct in 
(1), obtaining uy, — C7uz_ = C7(Uyy — Ux) = 0. And in (2) we set y = ct, so that 
Ut C igs = Cy — Ug). 

A, B, C may be functions of x, y, so that a PDE may be of mixed type, that is, of different 
type in different regions of the xy-plane. An important mixed-type PDE is the Tricomi 
equation (see Prob. 10). 


Transformation of (14) to Normal Form. The normal forms of (14) and the correspond- 
ing transformations depend on the type of the PDE. They are obtained by solving the 
characteristic equation of (14), which is the ODE 


(15) Ay’? — 2By’ + C=0 


where y’ = dy/dx (note —2B, not +2B). The solutions of (15) are called the characteristics 
of (14), and we write them in the form ® (x, y) = const and W(x, y) = const. Then the 
transformations giving new variables v, w instead of x, y and the normal forms of (14) are 
as follows. 
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Type New Variables Normal Form 
Hyperbolic v=@® w=V Uypw = Fy 
Parabolic v=xXx w=O=¥V Upw = Fo 
1 1 
Elliptic v= 5(® + W) w= 3°? —W) Uyy + Upw = F3 
i 
Here, © = P(x, y), V = Wx, y), Fy = Fi(v, w, u, Uy, Uw), etc., and we denote u as 
function of v, w again by u, for simplicity. We see that the normal form of a hyperbolic 
PDE is as in d’ Alembert’s solution. In the parabolic case we get just one family of solutions 
® = W. In the elliptic case, i= V—1, and the characteristics are complex and are of 
minor interest. For derivation, see Ref. [GenRef3] in App. 1. 
EXAMPLE 1. D’Alembert’s Solution Obtained Systematically 


The theory of characteristics gives d’Alembert’s solution in a systematic fashion. To see this, we write the wave 
equation uy, — Ce = 0 in the form (14) by setting y = ct. By the chain rule, uz, = uyyy = cuy and uy = CUyy. 
Division by 3 GIVES Ux — Uyy = 0, as stated before. Hence the characteristic equation is if? -1=0'+) 
(y’ — 1) = 0. The two families of solutions (characteristics) are ®(x, y) = y + x = const and V(x, y) = y —x= 


const. This gives the new variables v 
solution u = fy(x + ct) + fox — ct). 


y 


x and d’ Alembert’s 
ia] 


x=ctt+txandw= WV=y-—x=ct 


PROBLEM SET 12-4 


1, 


Show that c is the speed of each of the two waves given 
by (4). 


. Show that, because of the boundary conditions (2), Sec. 


12.3, the function fin (13) of this section must be odd 
and of period 2L. 


. If a steel wire 2 m in length weighs 0.9 nt (about 0.20 


Ib) and is stretched by a tensile force of 300 nt (about 
67.4 Ib), what is the corresponding speed of transverse 
waves? 


. What are the frequencies of the eigenfunctions in 


Prob. 3? 


5-8 


GRAPHING SOLUTIONS 


Using (13) sketch or graph a figure (similar to Fig. 291 in 
Sec. 12.3) of the deflection u(x, tf) of a vibrating string 
(length L = 1, ends fixed, c = 1) starting with initial 
velocity 0 and initial deflection (k small, say, k = 0.01). 


5. 
te 


F(x) = k sin 7x 
F(x) = k sin 27x 


6. f(x) = k(1 — cos 77x) 
8. f(x) = kx — x) 


NORMAL FORMS 


Find the type, transform to normal form, and solve. Show 
your work in detail. 


9. 


Uy + Auyy =0 10. Ux — l6uyy =0 


11. 
13. 
15. 
17. 
19. 


20. 


Ugy + Wgy + Uyy =O 12. Ugy — 2gy + Uyy = 0 
Ugy + Suzy + 4yy = 0 14. xuzy — Yuyy = 0 
Uy — YUxy = O 16. Uzy + 2Uyy + 10Uyy = 0 


Ugy — 4Ugzy + SuUyy =O 18. 
Longitudinal Vibrations of an Elastic Bar or Rod. 
These vibrations in the direction of the x-axis are 
modeled by the wave equation uz. = Cae, Co = E/p 
(see Tolstov [C9], p. 275). If the rod is fastened at one 
end, x = 0, and free at the other, x = L, we have 
u(0, t) = 0 and u,(L, t) = 0. Show that the motion 
corresponding to initial displacement u(x, 0) = f(x) 
and initial velocity zero is 


Uxy — OUzy + QUyy = O 


u= > Ap, SiN PyX COS PyCt, 
n=0 
ie (2n + 1) 
Ay == | f() sin pyx dx, Pn =. 
2L 


Tricomi and Airy equations.” Show that the Tricomi 
equation yy, + Uy, = 0 is of mixed type. Obtain the 
Airy equation G” — yG=0 from the Tricomi 
equation by separation. (For solutions, see p. 446 of 
Ref. [GenRef1] listed in App. 1.) 


Sir GEORGE BIDELL AIRY (1801-1892), English mathematician, known for his work in elasticity. FRANCESCO 
TRICOMI (1897-1978), Italian mathematician, who worked in integral equations and functional analysis. 
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12.5 Modeling: Heat Flow from a Body 
in Space. Heat Equation 


After the wave equation (Sec. 12.2) we now derive and discuss the next “big” PDE, the 
heat equation, which governs the temperature u in a body in space. We obtain this model 
of temperature distribution under the following. 


Physical Assumptions 
1. The specific heat o and the density p of the material of the body are constant. No 
heat is produced or disappears in the body. 


2. Experiments show that, in a body, heat flows in the direction of decreasing 
temperature, and the rate of flow is proportional to the gradient (cf. Sec. 9.7) of the 
temperature; that is, the velocity v of the heat flow in the body is of the form 


(1) v = —K grad u 
where u(x, y, z, ) is the temperature at a point (x, y, z) and time ¢. 
3. The thermal conductivity K is constant, as is the case for homogeneous material and 


nonextreme temperatures. 


Under these assumptions we can model heat flow as follows. 
Let T be a region in the body bounded by a surface S with outer unit normal vector n 
such that the divergence theorem (Sec. 10.7) applies. Then 


ven 


is the component of v in the direction of n. Hence |v * n AA| is the amount of heat leaving 
T (if ven > 0 at some point P) or entering T (if ven < 0 at P) per unit time at some 
point P of S through a small portion AS of S of area AA. Hence the total amount of heat 
that flows across S from T is given by the surface integral 


fae 


S 


Note that, so far, this parallels the derivation on fluid flow in Example 1 of Sec. 10.8. 
Using Gauss’s theorem (Sec. 10.7), we now convert our surface integral into a volume 
integral over the region T. Because of (1) this gives [use (3) in Sec. 9.8] 


(2) |] vemaa 


Ss 


-x|| (grad u) *ndA = -x|| Jaiv (grad u) dx dy dz 
Ss T 


-|| |v dx dy dz. 
T 


Here, 


is the Laplacian of wu. 
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On the other hand, the total amount of heat in T is 


H= [| Joon ar ay 
T 


with o and p as before. Hence the time rate of decrease of H is 


0H ou 
oF || oe = dx dy dz. 
T 


This must be equal to the amount of heat leaving T because no heat is produced or 
disappears in the body. From (2) we thus obtain 


-|[ [oo dx dy dz = -K ||| veuaedy ae 
T T 
[[[(Z= v%e) aeay ae = 0 os 
Ot op 
T 


Since this holds for any region T in the body, the integrand (if continuous) must be zero 
everywhere. That is, 


or (divide by —op) 


(3) = SC Vw — K/pa 
This is the heat equation, the fundamental PDE modeling heat flow. It gives the 
temperature u(x, y, z, t) in a body of homogeneous material in space. The constant c? is 
the thermal diffusivity. K is the thermal conductivity, o the specific heat, and p the density 
of the material of the body. Vu is the Laplacian of u and, with respect to the Cartesian 
coordinates x, y, Z, is 


The heat equation is also called the diffusion equation because it also models chemical 
diffusion processes of one substance or gas into another. 


12.6 Heat Equation: Solution by Fourier Series. 


Steady Two-Dimensional Heat Problems. 
Dirichlet Problem 


We want to solve the (one-dimensional) heat equation just developed in Sec. 12.5 and 
give several applications. This is followed much later in this section by an extension of 
the heat equation to two dimensions. 
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0 x=L 


Fig. 294. Bar under consideration 


As an important application of the heat equation, let us first consider the temperature 
in a long thin metal bar or wire of constant cross section and homogeneous material, which 
is oriented along the x-axis (Fig. 294) and is perfectly insulated laterally, so that heat flows 
in the x-direction only. Then besides time, u depends only on x, so that the Laplacian 
reduces tO Uz. = a°u/ ax”, and the heat equation becomes the one-dimensional heat 
equation 


ou 29 a7u 

” Ot : ax?" 
This PDE seems to differ only very little from the wave equation, which has a term uz 
instead of uz, but we shall see that this will make the solutions of (1) behave quite 
differently from those of the wave equation. 

We shall solve (1) for some important types of boundary and initial conditions. We 
begin with the case in which the ends x = 0 and x = L of the bar are kept at temperature 
zero, so that we have the boundary conditions 


(2) u(0, t) = 0, u(L, t) = 0 for allt = 0. 


Furthermore, the initial temperature in the bar at time t = 0 is given, say, f(x), so that we 
have the initial condition 


(3) u(x, 0) = f(x) L f(x) given]. 


Here we must have f(0) = 0 and f(L) = 0 because of (2). 

We shall determine a solution u(x, f) of (1) satisfying (2) and (3)—one initial condition 
will be enough, as opposed to two initial conditions for the wave equation. Technically, 
our method will parallel that for the wave equation in Sec. 12.3: a separation of variables, 
followed by the use of Fourier series. You may find a step-by-step comparison 
worthwhile. 


Step 1. Two ODEs from the heat equation (1). Substitution of a product u(x, 1) = 
F(x)G(t) into (1) gives FG = c2F"G with G = dG/dt and F" = d?F/dx?. To separate 
the variables, we divide by c7FG, obtaining 
G _ F" 
4 as 
@) eG F 
The left side depends only on ¢ and the right side only on x, so that both sides must equal 
a constant k (as in Sec. 12.3). You may show that for k = 0 or k > 0 the only solution 
u = FG satisfying (2) is u = 0. For negative k = —p? we have from (4) 
G F" P 

2a -  F 

c“G F 
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Multiplication by the denominators immediately gives the two ODEs 


(5) F" + p?F=0 
and 
(6) G + c*p’G = 0. 


Step 2. Satisfying the boundary conditions (2). We first solve (5). A general solution is 
(7) F(x) = A cos px + B sin px. 
From the boundary conditions (2) it follows that 

u(0, t) = F(O)G() = 0 and u(L, t) = F(L)G(t) = 0. 
Since G = 0 would give u = 0, we require F(0) = 0, F(Z) = 0 and get F(O) = A = 0 
by (7) and then F(L) = B sin pL = 0, with B # 0 (to avoid F = 0); thus, 


sin pL = 0, hence p= n=1,2,-°°. 


Setting B = 1, we thus obtain the following solutions of (5) satisfying (2): 


F,(x) = sin a i= Tod 


(As in Sec. 12.3, we need not consider negative integer values of n.) 
All this was literally the same as in Sec. 12.3. From now on it differs since (6) differs 
from (6) in Sec. 12.3. We now solve (6). For p = n7r/L, as just obtained, (6) becomes 


G+)2G=0 where A= ae 
It has the general solution 
Gal) = Bre, n=1,2,-" 
where B,, is a constant. Hence the functions 
(8) tin (%, 8) = Fyl@)Ga(@) = B,,sin = et (n= 1,2,-°) 


are solutions of the heat equation (1), satisfying (2). These are the eigenfunctions of the 
problem, corresponding to the eigenvalues A,, = cn7/L. 


Step 3. Solution of the entire problem. Fourier series. So far we have solutions (8) 
satisfying the boundary conditions (2). To obtain a solution that also satisfies the initial 
condition (3), we consider a series of these eigenfunctions, 


(9) u(x,t) = Sunt) = S)Bn sin ont (a, = om) 


n=1 n=1 
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EXAMPLE=1 


EXAMPLE 2 


From this and (3) we have 
u(x, 0) = SB, sin ae = f(x). 
n=1 


Hence for (9) to satisfy (3), the B,,’s must be the coefficients of the Fourier sine series, 
as given by (4) in Sec. 11.3; thus 


ib, 
» 5 OES - a 
(10) By, = al F(x) mo (n = 1, 2,-:-.) 


The solution of our problem can be established, assuming that f(x) is piecewise continuous 
(see Sec. 6.1) on the interval 0 = x S L and has one-sided derivatives (see Sec. 11.1) at all 
interior points of that interval; that is, under these assumptions the series (9) with coefficients 
(10) is the solution of our physical problem. A proof requires knowledge of uniform 
convergence and will be given at a later occasion (Probs. 19, 20 in Problem Set 15.5). 

Because of the exponential factor, all the terms in (9) approach zero as t approaches 
infinity. The rate of decay increases with n. 


Sinusoidal Initial Temperature 


Find the temperature u(x, f) in a laterally insulated copper bar 80 cm long if the initial temperature is 
100 sin (7x/80) °C and the ends are kept at 0°C. How long will it take for the maximum temperature in the bar 
to drop to 50°C? First guess, then calculate. Physical data for copper: density 8.92 g/ cm*, specific heat 
0.092 cal/(g °C), thermal conductivity 0.95 cal/(cm sec °C). 


Solution. The initial condition gives 


= _ NWTX : | WX 
u(x, 0) = > B, sin oa f(x) = 100 sin a0 


n=l 


Hence, by inspection or from (9), we get By = 100, Bp = Bz = -:- = 0. In (9) we need i= cq? /L?, where 
c? = K/(ap) = 0.95/(0.092 - 8.92) = 1.158 [cm?/sec]. Hence we obtain 


At = 1.158 - 9.870/807 = 0.001785 [sec™ 1]. 


The solution (9) is 
Xx 
u(x, t) = 100 sin — e70-001785t. 
80 


Also, 100e~9-001785* = 50 when t = (In 0.5)/(—0.001785) = 388 [sec] ~ 6.5 [min]. Does your guess, or at 
least its order of magnitude, agree with this result? 2] 
Speed of Decay 


Solve the problem in Example 1 when the initial temperature is 100 sin (377x/80) °C and the other data are as 
before. 


Solution. In (9), instead of n = 1 we now have n = 3, and 3 = 37yt = 9 - 0.001785 = 0.01607, so that 
the solution now is 


37x 
u(x, t) = 100 sin 267 0.01607. 
80 


Hence the maximum temperature drops to 50°C in t = (In 0.5)/(—0.01607) ~ 43 [sec], which is much faster 
(9 times as fast as in Example 1; why?). 
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Had we chosen a bigger n, the decay would have been still faster, and in a sum or series of such terms, each 
term has its own rate of decay, and terms with large n are practically 0 after a very short time. Our next example 
is of this type, and the curve in Fig. 295 corresponding to t = 0.5 looks almost like a sine curve; that is, it is 
practically the graph of the first term of the solution. i] 


Fig. 295. Example 3. Decrease of temperature 
with time t for L = 7 andc = 1 


“Triangular” Initial Temperature in a Bar 


Find the temperature in a laterally insulated bar of length L whose ends are kept at temperature 0, assuming that 
the initial temperature is 


x if 0<x<L/2, 
f@) = 
L-x if L/2<x<L. 


(The uppermost part of Fig. 295 shows this function for the special L = 77.) 


Solution. From (10) we get 


L/2 L 
2 _ NWTX . ATX 
(10*) By == (| x sin es dx + | (L — x) sin ; ax), 


Integration gives B,, = 0 if n is even, 


B = (n = 1,5,9, +++) d B ae (n = 3,7, 11,++*) 
- n= 1,5, 950% an == n= 3,1, 11,73"): 
. nea . na 


(see also Example 4 in Sec. 11.3 with k = L/2). Hence the solution is 


4y.| . x cm 1 |. 37x [ 3cm\? 
u(x, f) 3 | sin exp t 9 sin exp t} + 


T L L L 


Figure 295 shows that the temperature decreases with increasing t, because of the heat loss due to the cooling 
of the ends. 
Compare Fig. 295 and Fig. 291 in Sec. 12.3 and comment. 1] 
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EXAMPLE-4 


EXAMPLE 5 


Bar with Insulated Ends. Eigenvalue 0 
Find a solution formula of (1), (3) with (2) replaced by the condition that both ends of the bar are insulated. 


Solution. Physical experiments show that the rate of heat flow is proportional to the gradient of the 
temperature. Hence if the ends x = 0 and x = L of the bar are insulated, so that no heat can flow through the 
ends, we have grad u = uz, = du/dx and the boundary conditions 


(2*) u,(O, t) = 0, u,(L, t) = 0 for all ¢. 


Since u(x, t) = F(x)G(t), this gives u,,(0, t) = F’(0)G(t) = 0 and u,(L, t) = F'(L)G(t) = 0. Differentiating 
(7), we have F’(x) = —Ap sin px + Bp cos px, so that 


F'(0) = Bp =0 and then F'(L) = —Ap sin pL = 0. 


The second of these conditions gives p = py = n7/L, (n = 0, 1, 2, ---). From this and (7) with A = 1 and 
B = 0 we get F,,(x) = cos (n7x/L), (n = 0, 1, 2, ---). With G,, as before, this yields the eigenfunctions 


nNTTX —A2t 
(11) Un, t) = Fy(x)Gy(t) = An cos “ae " (n = 0, 1, +++) 


corresponding to the eigenvalues A, = cn7r/L. The latter are as before, but we now have the additional eigenvalue 
Ao = 0 and eigenfunction ug = const, which is the solution of the problem if the initial temperature f(x) is 
constant. This shows the remarkable fact that a separation constant can very well be zero, and zero can be an 
eigenvalue. 

Furthermore, whereas (8) gave a Fourier sine series, we now get from (11) a Fourier cosine series 


= A es a 
(12) UG en i DAncos =e (a _on ) 


n=0 n=0 


Its coefficients result from the initial condition (3), 


= nTTx 
u(x, 0) = >) An cos oa ff), 
n=0 


in the form (2), Sec. 11.3, that is, 


i lg 9p nNTTX 
(13) Ap =—| f(x) dx, Ay, =—| f(x) cos — dx, n=1,2,-°°. B 
LJy LJy L 


“Triangular” Initial Temperature in a Bar with Insulated Ends 


Find the temperature in the bar in Example 3, assuming that the ends are insulated (instead of being kept at 
temperature 0). 


Solution. For the triangular initial temperature, (13) gives Ag = L/4 and (see also Example 4 in Sec. 11.3 
with k = L/2) 


L/2 
nx nTXx 2L nT 
xX COS dx + (L — x) cos dx 2 cos cos ni — | }. 
L L 7 2 


Hence the solution (12) is 


L SL fl wx [ (2a), 1 bax [| (6cer\] 
u(x, t) 2 | 52 cos exp t| + cos exp t 
T be - L o 


We see that the terms decrease with increasing ¢, and u—>L/4 as t—>™; this is the mean value of the initial 
temperature. This is plausible because no heat can escape from this totally insulated bar. In contrast, the cooling 
of the ends in Example 3 led to heat loss and u — 0, the temperature at which the ends were kept. | 


+ L L L 
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Steady Two-Dimensional Heat Problems. 
Laplace’s Equation 


We shall now extend our discussion from one to two space dimensions and consider the 
two-dimensional heat equation 


a au a 
— CVU = 2 _ + :) 
ot Ox oy 


for steady (that is, time-independent) problems. Then du/dt = 0 and the heat equation 
reduces to Laplace’s equation 


(14) Vai) 


(which has already occurred in Sec. 10.8 and will be considered further in Secs. 
12.8-12.11). A heat problem then consists of this PDE to be considered in some region 
R of the xy-plane and a given boundary condition on the boundary curve C of R. This is 
a boundary value problem (BVP). One calls it: 


First BVP or Dirichlet Problem if u is prescribed on C (“Dirichlet boundary 
condition’) 


Second BVP or Neumann Problem if the normal derivative u, = du/dn is 
prescribed on C (“Neumann boundary condition’) 


Third BVP, Mixed BVP, or Robin Problem if u is prescribed on a portion of C 
and u,, on the rest of C (“Mixed boundary condition’). 


u =flx) 


w=0 Ad 


Fig. 296. Rectangle R and given boundary values 


Dirichlet Problem in a Rectangle R (Fig. 296). We consider a Dirichlet problem for 
Laplace’s equation (14) in a rectangle R, assuming that the temperature u(x, y) equals a 
given function f(x) on the upper side and 0 on the other three sides of the rectangle. 

We solve this problem by separating variables. Substituting u(x, y) = F(x)G()) into 
(14) written as Uz, = —Uyy, dividing by FG, and equating both sides to a negative 
constant, we obtain 
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From this we get 


2 
oF KF =0, 
dx? 


and the left and right boundary conditions imply 
F(O) = 0, and F(a) = 0. 


This gives k = (n7r/ a)” and corresponding nonzero solutions 


(15) F(x) = F(x) = sin” x, he ioe, 


The ODE for G with k = (n7/ a)" then becomes 


2 2 
oe () G=0. 
dy a 


Solutions are 
G(y) = Galy) = Ane”7/% + Bye "74. 


Now the boundary condition u = 0 on the lower side of R implies that G,,(0) = 0; that 
is, G,(0) = Ay + By, = 0 or By, = —Ay. This gives 


nw 
Gly) = Ap(er™4/? — e~ "74% = 2A,, sinh —, 


From this and (15), writing 2A, = A*, we obtain as the eigenfunctions of our problem 


a 


(16) UnlX, Y) = Fr(x)Gn(y) = An si 


These solutions satisfy the boundary condition u = 0 on the left, right, and lower sides. 
To get a solution also satisfying the boundary condition u(x, b) = f(x) on the upper 
side, we consider the infinite series 


u(x, y) = Sun y). 
From this and (16) with y = b we obtain 


u(x, b) = f(x) = Sat sin sinh 


n=1 


We can write this in the form 


= b 
u(x, b) = Dy (4% sinh wz sin am. 


n=1 
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This shows that the expressions in the parentheses must be the Fourier coefficients b,, of 
F(x); that is, by (4) in Sec. 11.3, 


a 
oe Saeed nib 2 . nix 
by, = Aj, sinh = 2 | F(x) sin a dx. 

0 


From this and (16) we see that the solution of our problem is 


(17) u(x, y) = > Hal y= Ai sin“ sinh “=> 
n=. n= 
where 
2 a 17. 
NITX 
18 Ax -—2___| x) sin —— dx. 
me) "~~ asinh (ntrb/a) J, Fe) a 


We have obtained this solution formally, neither considering convergence nor showing 
that the series for uv, uz, and uy, have the right sums. This can be proved if one assumes 
that f and f’ are continuous and f” is piecewise continuous on the interval 0 S x S a. 
The proof is somewhat involved and relies on uniform convergence. It can be found in 
[C4] listed in App. 1. 


Unifying Power of Methods. Electrostatics, Elasticity 


The Laplace equation (14) also governs the electrostatic potential of electrical charges in any 
region that is free of these charges. Thus our steady-state heat problem can also be interpreted 
as an electrostatic potential problem. Then (17), (18) is the potential in the rectangle R when 
the upper side of R is at potential f(x) and the other three sides are grounded. 

Actually, in the steady-state case, the two-dimensional wave equation (to be considered 
in Secs. 12.8, 12.9) also reduces to (14). Then (17), (18) is the displacement of a rectangular 
elastic membrane (rubber sheet, drumhead) that is fixed along its boundary, with three 
sides lying in the xy-plane and the fourth side given the displacement f(x). 

This is another impressive demonstration of the unifying power of mathematics. It 
illustrates that entirely different physical systems may have the same mathematical model 
and can thus be treated by the same mathematical methods. 


PROBLEM SET 12-6 


1. Decay. How does the rate of decay of (8) with fixed 3. Eigenfunctions. Sketch or graph and compare the first 
n depend on the specific heat, the density, and the three eigenfunctions (8) with B, = 1,c = 1, and 
thermal conductivity of the material? L=q fort = 0,0.1, 0.2,---, 1.0. 

2. Decay. If the first eigenfunction (8) of the bar 4. WRITING PROJECT. Wave and Heat Equations. 
decreases to half its value within 20 sec, what is the Compare these PDEs with respect to general behavior 


value of the diffusivity? of eigenfunctions and kind of boundary and initial 
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conditions. State the difference between Fig. 291 in 
Sec. 12.3 and Fig. 295. 


LATERALLY INSULATED BAR 

Find the temperature u(x, t) in a bar of silver of length 
10 cm and constant cross section of area 1 cm? (density 
10.6 g/cm, thermal conductivity 1.04 cal/(cm sec °C), 
specific heat 0.056 cal/(g °C) that is perfectly insulated 
laterally, with ends kept at temperature 0°C and initial 
temperature f(x) °C, where 

5. f(x) = sin 0.17rx 

6. f(x) = 4 — 0.8|x — 5| 

7. f(x) = x10 — x) 

8. Arbitrary temperatures at ends. If the ends x = 0 
and x = L of the bar in the text are kept at constant 
temperatures U, and Ub, respectively, what is the tem- 
perature u(x) in the bar after a long time (theoretically, 
as t—> ©)? First guess, then calculate. 


9. In Prob. 8 find the temperature at any time. 


10. Change of end temperatures. Assume that the ends 
of the bar in Probs. 5—7 have been kept at 100°C for a 
long time. Then at some instant, call it tf = 0, the 
temperature at x = L is suddenly changed to 0°C and 
kept at 0°C, whereas the temperature at x = 0 is kept 
at 100°C. Find the temperature in the middle of the bar 
at f = 1, 2, 3, 10, 50 sec. First guess, then calculate. 


BAR UNDER ADIABATIC CONDITIONS 


“Adiabatic” means no heat exchange with the neigh- 
borhood, because the bar is completely insulated, also at 
the ends. Physical Information: The heat flux at the ends 
is proportional to the value of du/dx there. 


11. Show that for the completely insulated bar, u,.(0, t) = 0, 
u,(L, t) = 0, u(x, t) = f(x) and separation of variables 
gives the following solution, with A, given by (2) in 
Sec. 11.3. 


as NIX 
u(x, 0) = Ag + > Ay, COS a een /Lyt 
n=1 
12-15} Find the temperature in Prob. 11 with L = 77, 
c = 1, and 
12. f(x) = x 13. f(x) = 1 


14. f(x) = cos 2x 15. f(x) = 1 — x/7 

16. A bar with heat generation of constant rate H ( > 0) 
is modeled by uz = CU, + H. Solve this problem if 
L = 7 and the ends of the bar are kept at 0°C. Hint. 
Set u = v — Hx(x — 1)/(2c’?). 

17. Heat flux. The heat flux of a solution u (x, t) across x = 0 
is defined by (ft) = —Ku,(0, t). Find (ft) for the 
solution (9). Explain the name. Is it physically under- 
standable that @ goes to 0 as t—> »? 
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18. Laplace equation. Find the potential in the rec- 
tangle 0 S x S 20,0 = y S 40 whose upper side is 
kept at potential 110 V and whose other sides are 
grounded. 

19. Find the potential in the square0 Sx S=2,05y32 
if the upper side is kept at the potential 1000 sin 5X 
and the other sides are grounded. 


20. CAS PROJECT. Isotherms. Find the steady-state 
solutions (temperatures) in the square plate in Fig. 297 
with a = 2 satisfying the following boundary condi- 
tions. Graph isotherms. 


(a) u = 80 sin 7x on the upper side, 0 on the others. 
(b) uw = Oon the vertical sides, assuming that the other 
sides are perfectly insulated. 

(c) Boundary conditions of your choice (such that the 
solution is not identically zero). 


a x 


Fig. 297. Square plate 


21. Heat flow in a plate. The faces of the thin square plate 
in Fig. 297 with side a = 24 are perfectly insulated. 
The upper side is kept at 25°C and the other sides are 
kept at 0°C. Find the steady-state temperature u(x, y) 
in the plate. 


22. Find the steady-state temperature in the plate in Prob. 
21 if the lower side is kept at Up°C, the upper side at 
U,°C, and the other sides are kept at 0°C. Hint: Split 
into two problems in which the boundary temperature 
is 0 on three sides for each problem. 


23. Mixed boundary value problem. Find the steady- 
state temperature in the plate in Prob. 21 with the upper 
and lower sides perfectly insulated, the left side kept 
at O°C, and the right side kept at f(y)°C. 

24. Radiation. Find steady-state temperatures in the 
rectangle in Fig. 296 with the upper and left sides 
perfectly insulated and the right side radiating into a 
medium at 0°C according to uy(a, y) + hu(a, y) = 0, 
h > 0 constant. (You will get many solutions since no 
condition on the lower side is given.) 


25. Find formulas similar to (17), (18) for the temperature 
in the rectangle R of the text when the lower side of R 
is kept at temperature f(x) and the other sides are kept 
at O°C. 
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12.7 Heat Equation: Modeling Very Long Bars. 
Solution by Fourier Integrals and 
Transforms 


Our discussion of the heat equation 


ou og au 
(1) Ot Ox 
in the last section extends to bars of infinite length, which are good models of very long 
bars or wires (such as a wire of length, say, 300 ft). Then the role of Fourier series in the 
solution process will be taken by Fourier integrals (Sec. 11.7). 

Let us illustrate the method by solving (1) for a bar that extends to infinity on both 
sides (and is laterally insulated as before). Then we do not have boundary conditions, but 
only the initial condition 


(2) u(x, 0) = f(x) (-S< 7%) 
where f(x) is the given initial temperature of the bar. 


To solve this problem, we start as in the last section, substituting u(x, 1) = F(x)G(t) 
into (1). This gives the two ODEs 


(3) F" + p?F=0 [see (5), Sec. 12.6] 
and 
(4) G + c*p*G = 0 [see (6), Sec. 12.6]. 


Solutions are 


2, 


F(x) = Acos px + B sin px and G(t) = ee Pt, 
respectively, where A and B are any constants. Hence a solution of (1) is 
(5) u(x, t; p) = FG = (A cos px + B sin px) e~°?*. 
Here we had to choose the separation constant k negative, k = —p”, because positive 


values of k would lead to an increasing exponential function in (5), which has no physical 
meaning. 


Use of Fourier Integrals 


Any series of functions (5), found in the usual manner by taking p as multiples of a fixed 
number, would lead to a function that is periodic in x when t = 0. However, since f(x) 
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in (2) is not assumed to be periodic, it is natural to use Fourier integrals instead of Fourier 
series. Also, A and B in (5) are arbitrary and we may regard them as functions of p, writing 
A = A(p) and B = B(P). Now, since the heat equation (1) is linear and homogeneous, 
the function 


00 


Oo | u(x, fp) dp = | [A(p) cos px + B(p) sin px] e~°?* dp 
0 0 


is then a solution of (1), provided this integral exists and can be differentiated twice with 
respect to x and once with respect to t¢. 


Determination of A(p) and B(p) from the Initial Condition. From (6) and (2) we get 


2) 


(7) u(x, 0) = | [A(p) cos px + B(p) sin px] dp = f(x). 
0 


This gives A(p) and B(p) in terms of f(x); indeed, from (4) in Sec. 11.7 we have 


i io : 
(8) A(p) = Al f(v) cos pu dv, B(p) = al Ff) sin pu dv. 
According to (1*), Sec. 11.9, our Fourier integral (7) with these A(p) and B(p) can be 
written 
1 oo eo 
u(x, 0) = =| | f(v) cos (px — pv) ao| dp. 
0 —0 


Similarly, (6) in this section becomes 


u(x, t) = 7 | | flv) cos (px — poem a | dp. 


0) —0 


Assuming that we may reverse the order of integration, we obtain 


(9) u(x, t) = = | feo| | e° Pt cos (px — pu) “| dv. 


=e 0 


Then we can evaluate the inner integral by using the formula 


2 


(10) | e-* cos 2bs ds = VE 
0 


[A derivation of (10) is given in Problem Set 16.4 (Team Project 24).] This takes the form 
of our inner integral if we choose p = s/(cVt) as a new variable of integration and set 


xX —U 


7 WcVt- 
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Then 2bs = (x — v)p and ds = cVtdp, so that (10) becomes 


io) — 2 
| e © * cos (px — pu) dp = V7 exp { Can \ 
0 2cVt Ac*t 


By inserting this result into (9) we obtain the representation 
emis Cao 
(11) u(x, t) = —_| f(v) exp {- ———S (010) 
LON Tia Act 


Taking z = (v — x)/ (2cVt) as a variable of integration, we get the alternative form 


(12) u(x, t) = ee (i fe + 2ezV1) e~* dz. 


If f(x) is bounded for all values of x and integrable in every finite interval, it can be 
shown (see Ref. [C10]) that the function (11) or (12) satisfies (1) and (2). Hence this 
function is the required solution in the present case. 


Temperature in an Infinite Bar 


Find the temperature in the infinite bar if the initial temperature is (Fig. 298) 


Fe) . =const if |x| <1, 
y= 
0 if |x| > 1. 


Fig. 298. Initial temperature in Example 1 


Solution. From (11) we have 


U (x — v)? 
u(x, t) = exp.) — dv. 
2cV Tt 4c?t 


If we introduce the above variable of integration z, then the integration over v from —1 to | corresponds to the 
integration over z from (—1 — x)/(2eV*2) to (1 — x)/(2cVt), and 


(1-a)/QeV't) 
0 
(13) u(x, t) = —— edz (t > 0). 
Vir “ 


-(1+a)/QceVt) 


We mention that this integral is not an elementary function, but can be expressed in terms of the error 
function, whose values have been tabulated. (Table A4 in App. 5 contains a few values; larger tables are 
listed in Ref. [GenRef1] in App. 1. See also CAS Project 1, p. 574.) Figure 299 shows u(x, t) for Up = 100°C, 
c= cm?/sec, and several values of t. o 
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EXAMPLE 2 


u(x, t) 


100 


Fig. 299. Solution u(x, t) in Example 1 for Uy = 100°C, 
c? = 1cm’/sec, and several values of t 


Use of Fourier Transforms 


The Fourier transform is closely related to the Fourier integral, from which we obtained the 
transform in Sec. 11.9. And the transition to the Fourier cosine and sine transform in Sec. 
11.8 was even simpler. (You may perhaps wish to review this before going on.) Hence it 
should not surprise you that we can use these transforms for solving our present or similar 
problems. The Fourier transform applies to problems concerning the entire axis, and the 
Fourier cosine and sine transforms to problems involving the positive half-axis. Let us explain 
these transform methods by typical applications that fit our present discussion. 


Temperature in the Infinite Bar in Example 1 
Solve Example | using the Fourier transform. 


Solution. The problem consists of the heat equation (1) and the initial condition (2), which in this example is 
f(x) = Up = const if |x| <1 and 0 otherwise. 


Our strategy is to take the Fourier transform with respect to x and then to solve the resulting ordinary DE in t. 
The details are as follows. 

Let u = #(u) denote the Fourier transform of u, regarded as a function of x. From (10) in Sec. 11.9 we see 
that the heat equation (1) gives 


Fut) = C’?F(Uyy) = C2(—w2)F(u) = —c2 wi. 


On the left, assuming that we may interchange the order of differentiation and integration, we have 


1 [* 1oaf* aa 
Fut) = | Were” dx = Van at | ue Y" dx = ar 
Thus 
ou 
on —c*wa 
ot 


Since this equation involves only a derivative with respect to ¢ but none with respect to w, this is a first-order 
ordinary DE, with t as the independent variable and w as a parameter. By separating variables (Sec. 1.3) we 
get the general solution 


2g 92 
aw, t) = C(wyeo ” * 
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with the arbitrary “constant” C(w) depending on the parameter w. The initial condition (2) yields the relationship 
u(w, 0) = C(w) = f(w) = #(f). Our intermediate result is 

ii(w, t) = fiwye owt, 


The inversion formula (7), Sec. 11.9, now gives the solution 


1 of" * owt deve 
(14) u(x, t) = —_| fiw) eo? Fe dw, 
V2 Jon 


In this solution we may insert the Fourier transform 


foi= sc | Fye®”av. 


7 


oo 


Assuming that we may invert the order of integration, we then obtain 


i : 
u(x, t) = =| fo | ee wrt gilwx— wy | dy, 
277 J_, 


2 


By the Euler formula (3). Sec. 11.9, the integrand of the inner integral equals 


ee - —c2wt - 
eo * cos (wx — wu) + ie ©” § sin (Wx — wu). 


We see that its imaginary part is an odd function of w, so that its integral is 0. (More precisely, this is the 


principal part of the integral; see Sec. 16.4.) The real part is an even function of w, so that its integral from —°% 
to © equals twice the integral from 0 to ~: 


1? eo 
u(x,t) = =| fo | eno wrt cos (wx — wv) dw | dv. 
= 0 


This agrees with (9) (with p = w) and leads to the further formulas (11) and (13). | 


Solution in Example 1 by the Method of Convolution 
Solve the heat problem in Example | by the method of convolution. 


Solution. The beginning is as in Example 2 and leads to (14), that is, 


(15) u(x, f) = = [. faye Ow tei dy, 


Now comes the crucial idea. We recognize that this is of the form (13) in Sec. 11.9, that is, 


(16) u(x, 1) = (f*g)(@) = | Sowygcwye"”* dw 
where 
(17) #00) = eu 
w) =e . 
s V2 


Since, by the definition of convolution [(11), Sec. 11.9], 


(18) (f¥g)@) = | S(p)g (x — p) dp, 


a 
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as our next and last step we must determine the inverse Fourier transform g of g. For this we can use formula 
9 in Table III of Sec. 11.10, 


Fe) = 1 eT aw 


with a suitable a. With c?t = 1/(4a) or a = 1/(4c?4), using (17) we obtain 
FeV /Ac"d) = V8 eH Ot = V2 V27rg(w). 
Hence g has the inverse 


1 = e 
é a7 /(4e7t) 


V 207 Var 


Replacing x with x — p and substituting this into (18) we finally have 


(19) ooo s— [x ) { enor 
u(x, t) = (f * g)(x) = p) exp ) — ————? dp. 
2cV Tt 4c7t 


2 


This solution formula of our problem agrees with (11). We wrote (f * g)(x), without indicating the parameter t 
with respect to which we did not integrate. ai] 


Fourier Sine Transform Applied to the Heat Equation 


If a laterally insulated bar extends from x = 0 to infinity, we can use the Fourier sine transform. We let the 
initial temperature be u(x, 0) = f(x) and impose the boundary condition u(0, t) = 0. Then from the heat equation 
and (9b) in Sec. 11.8, since f(0) = u(0, 0) = 0, we obtain 


Out 
F (uz) = a = OF (un) = —C?2w?F,(u) = —c?wfi,(w, 0). 
This is a first-order ODE dii,/dt + c’wii, = 0. Its solution is 
itg(w, t) = Clwye7? et, 
From the initial condition u(x, 0) = f(x) we have fi,(w, 0) = Aw) = C(w). Hence 


A a 2p? 
iis(w, t) = fwye* 7% 


Taking the inverse Fourier sine transform and substituting 


é 20° 
few) = Z| F(p) sin wp dp 
0 


on the right, we obtain the solution formula 
2 ta : —c?w"t |: 
(20) u(x,t) = = S(p) sin wp e sin wx dp dw. 
0 -0 


Figure 300 shows (20) with c = 1 for f(x) = 1 if 0 Sx = 1 and 0 otherwise, graphed over the xt-plane for 
0 Sx S2,0.01 StS 15. Note that the curves of u(x, t) for constant t resemble those in Fig. 299. BH 


574 


CHAP. 12 Partial Differential Equations (PDEs) 


Fig. 300. Solution (20) in Example 4 


PROBLEEM—SET 12-7 


1. CAS PROJECT. Heat Flow. (a) Graph the basic 


9. Graph the bell-shaped curve [the curve of the inte- 


Fig. 299. 


(b) In (a) apply animation to “see” the heat flow in 


terms of the decrease of temperature. 


(c) Graph u(x,t) with c = 1 as a surface over a 


rectangle of the form -a<x<a, O<y<b. 


2-8 | SOLUTION 
IN INTEGRAL FORM 


Using (6), obtain the solution of (1) in integral form 


satisfying the initial condition u(x, 0) = f(x), where 
2. f(x) = 1 if |x| < a and 0 otherwise 
3. f(x) = 1/1 + x”). 
Hint. Use (15) in Sec. 11.7. 
4. fx) =e 
5. f(x) = |x| if |x| < 1 and 0 otherwise 
6. f(x) =x if |x| < 1 and 0 otherwise 


7. f(x) = (sin x)/x. 
Hint. Use Prob. 4 in Sec. 11.7. 


8. Verify that wu in the solution of Prob. 7 satisfies the 


initial condition. 


9-12 CAS PROJECT. Error Function. 


2 x 
(21) erf x = =| e "” dw 
7 9 


This function is important in applied mathematics and 
physics (probability theory and statistics, thermodynamics, 
etc.) and fits our present discussion. Regarding it as a typical 
case of a special function defined by an integral that cannot 


be evaluated as in elementary calculus, do the following. 


10. 


11. 


12. 


13. 


14. 


15. 


grand in (21)]. Show that erf x is odd. Show that 


° ge Vit 
e dw = 3 ne — erf a). 


a 


b 
| en” dw = Varerf b. 
-b 
Obtain the Maclaurin series of erf x from that of the 
integrand. Use that series to compute a table of erf x 
for x = 0(0.01)3 (meaning x = 0, 0.01, 0.02,---, 3). 
Obtain the values required in Prob. 10 by an integration 
command of your CAS. Compare accuracy. 

It can be shown that erf (©) = 1. Confirm this experi- 
mentally by computing erf x for large x. 

Let f(x) = 1 when x > 0 and 0 when x < 0. Using 
erf (©) = 1, show that (12) then gives 


see 
edz 


u(x, t) = +_| 
Va a/(2cVt) 


1 1 x 
5 i ent ( 7) (t > 0). 


Express the temperature (13) in terms of the error 
function. 


a a 
Show that ®(x) = =| e > /? ds 
V2 


Here, the integral is the definition of the “distribution 
function of the normal probability distribution” to be 
discussed in Sec. 24.8. 
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12.8 Modeling: Membrane, 
Two-Dimensional Wave Equation 


Since the modeling here will be similar to that of Sec. 12.2, you may want to take another 
look at Sec. 12.2. 

The vibrating string in Sec. 12.2 is a basic one-dimensional vibrational problem. Equally 
important is its two-dimensional analog, namely, the motion of an elastic membrane, such 
as a drumhead, that is stretched and then fixed along its edge. Indeed, setting up the model 
will proceed almost as in Sec. 12.2. 


Physical Assumptions 


1. The mass of the membrane per unit area is constant (“homogeneous membrane’). 
The membrane is perfectly flexible and offers no resistance to bending. 


2. The membrane is stretched and then fixed along its entire boundary in the xy-plane. 
The tension per unit length T caused by stretching the membrane is the same at all 
points and in all directions and does not change during the motion. 


3. The deflection u(x, y, t) of the membrane during the motion is small compared to 
the size of the membrane, and all angles of inclination are small. 


Although these assumptions cannot be realized exactly, they hold relatively accurately for 
small transverse vibrations of a thin elastic membrane, so that we shall obtain a good 
model, for instance, of a drumhead. 


Derivation of the PDE of the Model (‘‘Two-Dimensional Wave Equation’’) from Forces. 
As in Sec. 12.2 the model will consist of a PDE and additional conditions. The PDE will be 
obtained by the same method as in Sec. 12.2, namely, by considering the forces acting on a 
small portion of the physical system, the membrane in Fig. 301 on the next page, as it is 
moving up and down. 

Since the deflections of the membrane and the angles of inclination are small, the sides 
of the portion are approximately equal to Ax and Ay. The tension T is the force per unit 
length. Hence the forces acting on the sides of the portion are approximately TAx and 
TAy. Since the membrane is perfectly flexible, these forces are tangent to the moving 
membrane at every instant. 


Horizontal Components of the Forces. We first consider the horizontal components 
of the forces. These components are obtained by multiplying the forces by the cosines of 
the angles of inclination. Since these angles are small, their cosines are close to 1. Hence 
the horizontal components of the forces at opposite sides are approximately equal. 
Therefore, the motion of the particles of the membrane in a horizontal direction will be 
negligibly small. From this we conclude that we may regard the motion of the membrane 
as transversal; that is, each particle moves vertically. 


Vertical Components of the Forces. These components along the right side and the 
left side are (Fig. 301), respectively, 


T Ay sin B and —T Ay sina. 


Here a and B are the values of the angle of inclination (which varies slightly along the 
edges) in the middle of the edges, and the minus sign appears because the force on the 
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Membrane 


TAy 


Fig. 301. Vibrating membrane 


left side is directed downward. Since the angles are small, we may replace their sines by 
their tangents. Hence the resultant of those two vertical components is 


T Ay(sin B — sina) ~ TAy(tan B — tan a) 


= TAy[uy(x a AX, y1) ~ Ux (X, ya)] 


(1) 


where subscripts x denote partial derivatives and y, and yp are values between y and 
y + Ay. Similarly, the resultant of the vertical components of the forces acting on the 
other two sides of the portion is 


(2) TAx[uy(%1,y + Ay) — Uy(X2, y)] 
where x1 and xg are values between x and x + Ax. 


Newton’s Second Law Gives the PDE of the Model. By Newton’s second law (see 
Sec. 2.4) the sum of the forces given by (1) and (2) is equal to the mass p AA of that 
small portion times the acceleration a°u/at?, here p is the mass of the undeflected 
membrane per unit area, and AA = Ax Ay is the area of that portion when it is unde- 
flected. Thus 


gz 

pAx Ay = TAyluy( + Ax, y1) ~ Hels 2) 
t 

+ TAx[uy(x1,y + Ay) — uy(%e, y)] 


where the derivative on the left is evaluated at some suitable point (x, y) corresponding 
to that portion. Division by pAx Ay gives 


SEC. 12.9 Rectangular Membrane. Double Fourier Series 577 


a7u T 
ar? ep 


u(x + Ax, y1) ~ Ua(% ya) , Multa y + Ay) ~ Uylxe, y) 
Ax Ay ; 


If we let Ax and Ay approach zero, we obtain the PDE of the model 


a a a T 
(3) etal ae “) eae 
ot Ox oy 


This PDE is called the two-dimensional wave equation. The expression in parentheses 
is the Laplacian A?u of u (Sec. 10.8). Hence (3) can be written 


(3’) 


Solutions of the wave equation (3) will be obtained and discussed in the next section. 


12.9 Rectangular Membrane. 
Double Fourier Series 


Now we develop a solution for the PDE obtained in Sec. 12.8. Details are as follows. 
The model of the vibrating membrane for obtaining the displacement u(x, y, f) of a point 
(x, y) of the membrane from rest (u = 0) at time f is 


r Ge) 
at axz ay 
(2) u = 0 on the boundary 
(3a) u(x, y, 0) = f% y) 
(3b) ut (x, y, 0) = g(x y). 


Here (1) is the two-dimensional wave equation with Ca T/p just derived, (2) is 
the boundary condition (membrane fixed along the boundary in the xy-plane for 
y all times t = 0), and (3) are the initial conditions at tf = 0, consisting of the given 
initial displacement (initial shape) f(x, y) and the given initial velocity g(x, y), where 
uz = du/dt. We see that these conditions are quite similar to those for the string in 


Sec. 12.2. 
a « Let us consider the rectangular membrane R in Fig. 302. This is our first important 
Fig. 302. model. It is much simpler than the circular drumhead, which will follow later. First we 
Rectangular note that the boundary in equation (2) is the rectangle in Fig. 302. We shall solve this 


membrane problem in three steps: 
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Step I. By separating variables, first setting u(x, y,t) = F(x, y)G(t) and later 
F(x, y) = H(x)Q(y) we obtain from (1) an ODE (4) for G and later from a PDE (5) for F 
two ODEs (6) and (7) for H and Q. 


Step 2. From the solutions of those ODEs we determine solutions (13) of (1) 
(“eigenfunctions” w,,,,) that satisfy the boundary condition (2). 


Step 3. We compose the w,,,, into a double series (14) solving the whole model (1), 


(2), (3). 


Step 1. Three ODEs From the Wave Equation (1) 


To obtain ODEs from (1), we apply two successive separations of variables. In the first 
separation we set u(x, y, t) = F(x, y)G(). Substitution into (1) gives 


FG = C?(FygG + FyyG) 


where subscripts denote partial derivatives and dots denote derivatives with respect to f. 
To separate the variables, we divide both sides by c?FG: 


G1 
2G ~ pfs + Fuy)- 


Since the left side depends only on ¢, whereas the right side is independent of t, both sides 
must equal a constant. By a simple investigation we see that only negative values of that 
constant will lead to solutions that satisfy (2) without being identically zero; this is similar 
to Sec. 12.3. Denoting that negative constant by —v", we have 


G _1 
CG = pea + Fyy) = —y’*, 


This gives two equations: for the “time function” G(t) we have the ODE 
(4) G+G=0 where A = cv, 


and for the “amplitude function” F (x, y) a PDE, called the two-dimensional Helmholtz* 
equation 


(5) [RE ioy ap od = O) 


7HERMANN VON HELMHOLTZ (1821-1894), German physicist, known for his fundamental work in 
thermodynamics, fluid flow, and acoustics. 
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Separation of the Helmholtz equation is achieved if we set F(x, y) = H(x)Q(y). By 
substitution of this into (5) we obtain 


2 2 
oT i (u£2 + F110) 
dy 


To separate the variables, we divide both sides by HQ, finding 


2 2 
1 d?H (£2 + .20)) 
H dx” Q \ dy” 


Both sides must equal a constant, by the usual argument. This constant must be negative, 
say, —k?, because only negative values will lead to solutions that satisfy (2) without being 
identically zero. Thus 


2 2 
1 d?H _ 1 (£2 + 129) = 2 
H dx” QO \ dy 


This yields two ODEs for H and Q, namely, 


2, 
Jel 
(6) aS or k?H =0 
dx 
and 
dz 
(7) a sF p70 =0 where p” = yp? — k?. 
dy 


Step 2. Satisfying the Boundary Condition 


General solutions of (6) and (7) are 
A(x) = A cos kx + B sin kx and Q(y) = Ccos py + D sin py 
with constant A, B,C, D. From u = FG and (2) it follows that F = HQ must be zero on 


the boundary, that is, on the edges x = 0,x = a, y = 0, y = b; see Fig. 302. This gives 
the conditions 


H(0) = 0, Ha) = 0, Q(0) = 0, Q(b) = 0. 


Hence H(0) = A = 0 and then H(a) = B sinka = 0. Here we must take B # 0 since 
otherwise H(x) = 0 and F(x, y) = 0. Hence sin ka = 0 or ka = m7r, that is, 


k= sali (m integer). 
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In precisely the same fashion we conclude that C = 0 and p must be restricted to the 
values p = n7r/b where n is an integer. We thus obtain the solutions H = Hy», Q = Qn, 
where 


7 = m=1,2,-°, 
. mmx _ on 
A,,(x) = sin and Q,(y) = sin = 

b n=1,2,-::. 
As in the case of the vibrating string, it is not necessary to consider m,n = —1, —2, --- 


since the corresponding solutions are essentially the same as for positive m and n, expect 
for a factor —1. Hence the functions 


m=1,2,°°-, 


‘ TX, 
(8) Finn @, Y) = Hm(x)Qn(y) = sin ” sin > _ 
a n=1,2,°-°-, 


are solutions of the Helmholtz equation (5) that are zero on the boundary of our membrane. 


Eigenfunctions and Eigenvalues. Having taken care of (5), we turn to (4). Since 
p” = v* — k? in (7) and A = cv in (4), we have 


A=H=cVR+ a 
Hence to k = m7r/a and p = n7r/b there corresponds the value 


2 2 m=1,2,°":, 
(9) A = Amn = CT = ap a 
a b n=1,2,°°°, 


in the ODE (4). A corresponding general solution of (4) is 
Ginn) = Bryn COS Amnt + Bein Sin Amnt- 
It follows that the functions uy y(x, y, 1) = Fimyn(x, y) Gmn(t), written out 


mmx . niry 
—— 


b 


(10) Dirples sh) =Car COS ara ae eta Sal Avy) SILO) 


with A», according to (9), are solutions of the wave equation (1) that are zero on 
the boundary of the rectangular membrane in Fig. 302. These functions are called the 
eigenfunctions or characteristic functions, and the numbers A), are called the 
eigenvalues or characteristic values of the vibrating membrane. The frequency of Um», 
is Ammn/277. 


Discussion of Eigenfunctions. It is very interesting that, depending on a and b, several 
functions F,,,, may correspond to the same eigenvalue. Physically this means that there 
may exists vibrations having the same frequency but entirely different nodal lines (curves 
of points on the membrane that do not move). Let us illustrate this with the following 
example. 
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EXAMPLE=1 


Eigenvalues and Eigenfunctions of the Square Membrane 


Consider the square membrane with a = b = 1. From (9) we obtain its eigenvalues 
(11) Amn = CTV m2 + 1n?. 
Hence Amn = Anm, but for m # n the corresponding functions 

Finn = Sin m7rx sin ntry and Frm = sin nix sin mtry 


are certainly different. For example, to Ay2 = Ag, = ca V5 there correspond the two functions 
Fy. = sin 7x sin 27ry and Fo, = sin 27x sin Try. 
Hence the corresponding solutions 
U2 = (By. cos cm V5t + Bro sinc V50)F 12 and U1 = (Bg cos cm V5t + Boy sin c7V51)F 21 


have the nodal lines y = 3 and x = 5. respectively (see Fig. 303). Taking Byy = 1 and Biz = BS, = 0, we 
obtain 


(12) yg + ugy = cos cm V5t (Fy2 + Boi F231) 


which represents another vibration corresponding to the eigenvalue c7r‘V5. The nodal line of this function is the 
solution of the equation 


Fig + Bo,Fo1 = sin 7x sin 27ry + Bog, sin 27x sin 7y = 0 
or, since sin 2a = 2 sina cosa, 
(13) sin 77x sin 7y(cos 7ry + Bg, cos 7x) = 0. 


This solution depends on the value of Bg; (see Fig. 304). 
From (11) we see that even more than two functions may correspond to the same numerical value of Amn. 
For example, the four functions Fyg, Fg1, F47, and F'74 correspond to the value 


18 Xg1 Naz A714 c7V65, because ? eu $7 =, 4? aR a = 65. 


This happens because 65 can be expressed as the sum of two squares of positive integers in several ways. 
According to a theorem by Gauss, this is the case for every sum of two squares among whose prime factors 
there are at least two different ones of the form 4n + | where n is a positive integer. In our case we have 


65=5-13=(44 112+ 1). 3 
uy Wig uo) 
: = i] 
—_ +-— 
I 
Uso Uys Us) 
Fig. 303. Nodal lines of the solutions Fig. 304. Nodal lines 
Uy, Uyy, Ud), Uz, U3, Uz, in the case of of the solution (12) for 


the square membrane some values of B3, 
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Step 3. Solution of the Model (1), (2), (3). 
Double Fourier Series 


So far we have solutions (10) satisfying (1) and (2) only. To obtain the solutions that also 
satisfies (3), we proceed as in Sec. 12.3. We consider the double series 


Hay O= SY duanGsy.D 


(14) m=l1n=1 
= > YG 6s Mane + Be saps ee Sint — 
m=l1n=1 


(without discussing convergence and uniqueness). From (14) and (3a), setting t = 0, we 
have 


cages _ mix , "my 
(15) U0) = SS Bae sin sin > = ff y). 


m=1n=1 


Suppose that f(x, y) can be represented by (15). (Sufficient for this is the continuity of 
f, of/ ax, af/dy, "f/x dy in R.) Then (15) is called the double Fourier series of f(x, y). 
Its coefficients can be determined as follows. Setting 


(16) Kn) = >) Bmn sin 
n=1 


we can write (15) in the form 
mx 


fey) = >) Km(y) sin] 


m=1 


For fixed y this is the Fourier sine series of f(x, y), considered as a function of x. From 
(4) in Sec. 11.3 we see that the coefficients of this expansion are 


a 
MITTX 


(17) Ky(y) = 2 | F(x, y) sin z dx. 
0 


Furthermore, (16) is the Fourier sine series of K,,(y), and from (4) in Sec. 11.3 it follows 
that the coefficients are 


b 
| Km(y) sin ee dy. 


2 

Bmn = = 

mn b 
0 


From this and (17) we obtain the generalized Euler formula 


b -a 
4 nT m=1,2,-:- 
(18) |e ae | | f(x, y) sin =* sin — dx dy 

0-0 e n= 1,2, *- 


for the Fourier coefficients of f(x, y) in the double Fourier series (15). 
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EXAMPLE 2 


The By in (14) are now determined in terms of f(x, y). To determine the B¥,,, we 
differentiate (14) termwise with respect to f; using (3b), we obtain 


oo co 


TT. TE 
= SS Bein Amn sin ———~ sin ~~~ = g(x,y). 
= a b 

t=0 m=1n=1 


ou 
at 


Suppose that g(x, y) can be developed in this double Fourier series. Then, proceeding as 
before, we find that the coefficients are 


1,2,°°° 


3 
II 


b -a 
(19) CE aree | | (x, y) sin sin 7 ax ay 
EDN 2 ‘ n=1,2,°"°. 


Result. Jf f and g in (3) are such that u can be represented by (14), then (14) with 
coefficients (18) and (19) is the solution of the model (1), (2), (3). 
Vibration of a Rectangular Membrane 


Find the vibrations of a rectangular membrane of sides a = 4 ft and b = 2 ft (Fig. 305) if the tension is 12.5 lb/ft, 
the density is 2.5 slugs/ft? (as for light rubber), the initial velocity is 0, and the initial displacement is 


(20) f(x,y) = 0.1(4x — x)Qy — y®) ft. 
y u 
y 
2 
4 
x ) si 
Membrane nitial displacement 


Fig. 305. Example 2 


Solution. c? = T/p = 12.5/2.5 = 5 [ft?/sec”]. Also Bx, = 0 from (19). From (18) and (20), 


a8 2 o .. mmx | nity 
Bun = > 0.1(4x — x*)(2y — y*) sin sin dx dy 
4-2), Jo 4 2 


a a. NIX : 2. ny 
(4x — x°) sin dx | (2y — y*) sin dy. 
20 Jy 4 6 2 


Two integrations by parts give for the first integral on the right 


128 256 
[b-(-1)"] (m odd) 
mq? mq3 
and for the second integral 
16 3 
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For even m or n we get 0. Together with the factor 1/20 we thus have B,,, = 0 if m or n is even and 


256-32 _ 0.426050 


20m3n37r6 mn? 


(m and n both odd). 


mn 


From this, (9), and (14) we obtain the answer 


u(x, y, t) = 0.426050) »} a 


mn odd Mn 


V5a 7 3 _ mmx | nity 
cos Vm + 4n* |} tsin sin 
4 4 


2 


VaT7V5 2 ax . TY 1 V5rV37 ox 37ry 
(21) = 0.426050 | cos t sin sin t cos t sin sin 
4 4 2 27 2 
1 Vaa7V13 | 3ax | TY 1 Vam7V45  3arx | 37ry ) 
+ cos t sin sin + cos tsin sin fe eres, Ihe 
27 4 2; 729 2 


To discuss this solution, we note that the first term is very similar to the initial shape of the membrane, has no 
nodal lines, and is by far the dominating term because the coefficients of the next terms are much smaller. The 
second term has two horizontal nodal lines (y = 2 4), the third term two vertical ones (x = 4, 8), the fourth 
term two horizontal and two vertical ones, and so on. |_| 


PROBLEM —SET 12-9 


1. Frequency. How does the frequency of the eigen- 
functions of the rectangular membrane change (a) If 
we double the tension? (b) If we take a membrane of 
half the density of the original one? (c) If we double 
the sides of the membrane? Give reasons. 

2. Assumptions. Which part of Assumption 2 cannot be 
satisfied exactly? Why did we also assume that the 
angles of inclination are small? 

3. Determine and sketch the nodal lines of the square 
membrane for m = 1, 2, 3, 4 and n = 1, 2, 3, 4. 


4-8| DOUBLE FOURIER SERIES 
Represent f(x, y) by a series (15), where 
4.f~y)=1, a=b=1 

5. faiy)=y, a=b=1 

6. fy) =x, a=b=1 

7. f(x, y) = xy, aand db arbitrary 


8. f(x, y) = xy(a — x)(b — y), aand b arbitrary 

9. CAS PROJECT. Double Fourier Series. (a) Write 
a program that gives and graphs partial sums of (15). 
Apply it to Probs. 5 and 6. Do the graphs show that 
those partial sums satisfy the boundary condition (3a)? 
Explain why. Why is the convergence rapid? 
(b) Do the tasks in (a) for Prob. 4. Graph a portion, 
say,O<x< 3, O0<y< 3, of several partial sums on 
common axes, so that you can see how they differ. (See 
Fig. 306.) 


(c) Do the tasks in (b) for functions of your choice. 


Fig. 306. Partial sums S,,5 and Sio10 
in CAS Project 9b 


10. CAS EXPERIMENT. Quadruples of F,,,,. Write a 
program that gives you four numerically equal Ayn» in 
Example 1, so that four different F,, correspond to it. 
Sketch the nodal lines of Fy, F'g1, F47, F74 in Example 
1 and similarly for further F',,, that you will find. 


11-13} SQUARE MEMBRANE 


Find the deflection u (x, y, t) of the square membrane of side 
a and c? = | for initial velocity 0 and initial deflection 


11. 0.1 sin 2x sin 4y 
12. 0.01 sin x sin y 


13. O.lay(a — x)(7 — y) 
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14-19} RECTANGULAR MEMBRANE 


14. 
15. 
16. 
17. 


18. 


Verify the discussion of (21) in Example 2. 

Do Prob. 3 for the membrane with a = 4 and b = 2. 
Verify By», in Example 2 by integration by parts. 
Find eigenvalues of the rectangular membrane of sides 
a = 2 and b = 1 to which there correspond two or 
more different (independent) eigenfunctions. 
Minimum property. Show that among all rectangular 
membranes of the same area A = ab and the same c 
the square membrane is that for which w, [see (10)] 
has the lowest frequency. 


19. 


20. 


Deflection. Find the deflection of the membrane of 
sides a and b with c? = 1 for the initial deflection 


67x 2 
FS, y) = sin sin a and initial velocity 0. 


Forced vibrations. Show that forced vibrations of a 
membrane are modeled by the PDEuy, = CVn + P/p, 
where P(x, y, f) is the external force per unit area acting 
perpendicular to the xy-plane. 


12.10 Laplacian in Polar Coordinates. 
Circular Membrane. 
Fourier—Bessel Series 


It is a general principle in boundary value problems for PDEs to choose coordinates that 
make the formula for the boundary as simple as possible. Here polar coordinates are used 
for this purpose as follows. Since we want to discuss circular membranes (drumheads), 
we first transform the Laplacian in the wave equation (1), Sec. 12.9, 


(1) Ure = C7V7u = C7 (Ure + Uyy) 


(subscripts denoting partial derivatives) into polar coordinates r, 6 defined by x = rcos 0, 
y = rsin 9; thus, 


P= V x2 + y?, 


iy 
tan@ =—. 
a 


By the chain rule (Sec. 9.6) we obtain 
Uy = Uply + UGOy.- 


Differentiating once more with respect to x and using the product rule and then again the 
chain rule gives 


Ugg = (Url) x mt (Up9x)a 
(2) = Upgly + Url ay + Ug)xOe + UpIax 


= (Url x: at UpG9q)lx + Uyl yx + (Uprlx + U9 99x) Ox, + UpO 2c. 


Also, by differentiation of r and @ we find 


x x 6 1 ( *) y 
i= Soy = a : 
PW? + y? r "1+ (y/x)* ae 
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Differentiating these two formulas again, we obtain 


P= arg 1 x. os 2 2xy 
Vex 3, 3 3> Dax: y 3 lx 4° 
r ror r r r 


We substitute all these expressions into (2). Assuming continuity of the first and second 
partial derivatives, we have u,g = ug,, and by simplifying, 


2 2 


2 
x xy y y xy 
(3) Une = > Ure 2 3 Ure + 4 Ugg + 3 uy + od Up. 


In a similar fashion it follows that 


2 2 2 


y xy x x 
(4) Uyy = > Urr + 2 3 re + 2 “00 + 
r r r 


By adding (3) and (4) we see that the Laplacian of u in polar coordinates is 


a7u i 1 ou 1 o2u 


5 Vu 
(5) “ ar? ror r? 06? 


Circular Membrane 


Circular membranes are important parts of drums, pumps, microphones, telephones, and 
other devices. This accounts for their great importance in engineering. Whenever a circular 
membrane is plane and its material is elastic, but offers no resistance to bending (this 
excludes thin metallic membranes!), its vibrations are modeled by the two-dimensional 
wave equation in polar coordinates obtained from (1) with Vu given by (5), that is, 


© a7u = (2 2 1 ou 1 a 2_T 
D: 


ar? ar? r or r2 ag" 


We shall consider a membrane of radius R (Fig. 307) and determine solutions u(r, f) 
that are radially symmetric. (Solutions also depending on the angle @ will be discussed in 
the problem set.) Then ugg = 0 in (6) and the model of the problem (the analog of (1), 
(2), (3) in Sec. 12.9) is 


‘ at ra) 
(8) u(R, t) = 0 for allt 2 0 
(9a) u(r, 0) = f(r) 
(9b) uz(r, 0) = g(r). 


Here (8) means that the membrane is fixed along the boundary circle r = R. The initial 
deflection f(r) and the initial velocity g(r) depend only on r, not on @, so that we can 
expect radially symmetric solutions u(r, f). 
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Step 1. Two ODEs From the Wave Equation (7). 
Bessel’s Equation 


Using the method of separation of variables, we first determine solutions u(r, t) = 
W(r)G(t). (We write W, not F because W depends on 7, whereas F, used before, depended 
on x.) Substituting wu = WG and its derivatives into (7) and dividing the result by c2WG, 
we get 


G1 ( rare ‘) 
sn HFTZwlwWt+-Ww 
PG W r 
where dots denote derivatives with respect to ¢ and primes denote derivatives with respect 
to r. The expressions on both sides must equal a constant. This constant must be negative, 


say, —k?, in order to obtain solutions that satisfy the boundary condition without being 
identically zero. Thus, 


This gives the two linear ODEs 


(10) GEYG-0 where A = ck 
and 
(11) Ww" + tw + k2w = 0. 


We can reduce (11) to Bessel’s equation (Sec. 5.4) if we set s = kr. Then 1/r = k/s and, 
retaining the notation W for simplicity, we obtain by the chain rule 


2 
_dW_dWds_ dW, gn OW 


ke. 
dr ds dr ds ds” 


w’ 


By substituting this into (11) and omitting the common factor k? we have 


2 
aw aw 


12 
oad ds” s ds 


W=0. 


This is Bessel’s equation (1), Sec. 5.4, with parameter v = 0. 


Step 2. Satisfying the Boundary Condition (8) 


Solutions of (12) are the Bessel functions Jp and ¥ of the first and second kind (see Secs. 
5.4, 5.5). But Yo becomes infinite at 0, so that we cannot use it because the deflection of 
the membrane must always remain finite. This leaves us with 


(13) Wr) = Jo(s) = Jo(kr) (s = kr). 
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On the boundary r = R we get W(R) = Jo(kKR) = 0 from (8) (because G = 0 would imply 
u = 0). We can satisfy this condition because Jo has (infinitely many) positive zeros, 
S = Q, Mg, --: (see Fig. 308), with numerical values 


ay = 2.4048, ag = 5.5201, ag = 8.6537, ag = 11.7915, as = 14.9309 


and so on. (For further values, consult your CAS or Ref. [GenRef1] in App. 1.) These 
zeros are slightly irregularly spaced, as we see. Equation (13) now implies 


Am 
(14) kR = Am thus k= Km = m=1,2,°°-. 
Hence the functions 
Am 
(15) Wralt) = Jo(kmr) = Jol =r), m= 1,2,-- 


are solutions of (11) that are zero on the boundary circle r = R. 


Eigenfunctions and Eigenvalues. For W,, in (15), a corresponding general solution of 
(10) with A = Ay = cky = Cy,/R is 


Giy(t) = Am COS Apt + By, Sin Amt. 


Hence the functions 
(16) Unt, 2) = Wr(NGm(t) = (Am cos Amt + Bry sin Amft)Jo(kmr) 


with m = 1, 2,--- are solutions of the wave equation (7) satisfying the boundary condition 
(8). These are the eigenfunctions of our problem. The corresponding eigenvalues are A,). 


The vibration of the membrane corresponding to uy, is called the mth normal mode; 
it has the frequency A,,,/277 cycles per unit time. Since the zeros of the Bessel function 
Jo are not regularly spaced on the axis (in contrast to the zeros of the sine functions 
appearing in the case of the vibrating string), the sound of a drum is entirely different 
from that of a violin. The forms of the normal modes can easily be obtained from Fig. 308 
and are shown in Fig. 309. For m = 1, all the points of the membrane move up (or down) 
at the same time. For m = 2, the situation is as follows. The function We(r) = Jo(aer/R) 
is zero for agr/R = ay, thus r = a1R/ag. The circle r = a 1R/qaz is, therefore, nodal line, 
and when at some instant the central part of the membrane moves up, the outer part 
(r > a ,R/az) moves down, and conversely. The solution u(r, f) has m — 1 nodal lines, 
which are circles (Fig. 309). 


Is) 
-10 -5 5 10 
Lg | re a. | 
-0, —<ar “aN ms a\_ “a, a. ~—~“a, ' 


Fig. 308. Bessel function J,(s) 
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PP aia 
Y 


m=l1 m=2 m=3 


Fig. 309. Normal modes of the circular membrane in the case of vibrations 
independent of the angle 


Step 3. Solution of the Entire Problem 


To obtain a solution u(r, tf) that also satisfies the initial conditions (9), we may proceed 
as in the case of the string. That is, we consider the series 


17) u(r, = > Wnl(AGm® = > Am 0s Amt + Bm Sin Amf)Jo (“,) 


m=1 m=1 


(leaving aside the problems of convergence and uniqueness). Setting t = O and using (9a), 
we obtain 


(18) u(r, 0) = > Ano Zr) = f(r). 
m=1 


Thus for the series (17) to satisfy the condition (9a), the constants A,, must be the 
coefficients of the Fourier—Bessel series (18) that represents f(r) in terms of Jo (am r/R); 
that is [see (9) in Sec. 11.6 with n = 0, @9,m = Gm, and x = rj, 


R 
2 Am 
(19) Am = P7a,) | rf(rJo (=) dr (m = 1,2, -:-). 


Differentiability of f(r) in the interval 0 =r = R is sufficient for the existence of the 
development (18); see Ref. [A13]. The coefficients B,,, in (17) can be determined from 
(9b) in a similar fashion. Numeric values of A,, and B,,, may be obtained from a CAS or 
by a numeric integration method, using tables of Jp and J;. However, numeric integration 
can sometimes be avoided, as the following example shows. 
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Vibrations of a Circular Membrane 


Find the vibrations of a circular drumhead of radius 1 ft and density 2 slugs/ft? if the tension is 8 lb/ft, the 
initial velocity is 0, and the initial displacement is 


ff) =1- r? [fel. 
Solution. c? = T/p = 8 =4 [ft?/sec?]. Also B,, = 0, since the initial velocity is 0. From (10) in Sec. 11.6, 
since R = 1, we obtain 

2 


Ji (Am) 


Am 


1 
| r= r2)Jo (Omi) dr 
0 


AJ. 2 (Q@m) 
arn] 1 (im) 
a 
ad af (Q@m) 


where the last equality follows from (21c), Sec. 5.4, with v = 1, that is, 


2 2 
Jo (Qm) Am, Jy (Qm) Jo (Qm) = Am, Ji (Qm)- 


Table 9.5 on p. 409 of [GenRefl] gives a, and Jo (a). From this we get Jy(@_,) = —Jo(am) by (21b), Sec. 5.4, 
with v = 0, and compute the coefficients Aj): 


m aan IG.) Jo(Qm) At 

1 2.40483 0.51915 0.43176 1.10801 
2 5.52008 —0.34026 —0.12328 —0.13978 
3 8.65373 0.27145 0.06274 0.04548 
4 11.79153 —0.23246 —0.03943 —0.02099 
5 14.93092 0.20655 0.02767 0.01164 
6 18.07106 —0.18773 —0.02078 —0.00722 
7 21.21164 0.17327 0.01634 0.00484 
8 24.35247 —0.16170 —0.01328 —0.00343 
9 27.49348 0.15218 0.01107 0.00253 
10 30.63461 —0.14417 —0.00941 —0.00193 


Thus 
f(r) = 1.108J9 (2.4048r) — 0.140J9 (5.5201r) + 0.045J9(8.6537r) — ---. 


We see that the coefficients decrease relatively slowly. The sum of the explicitly given coefficients in the table 
is 0.99915. The sum of all the coefficients should be 1. (Why?) Hence by the Leibniz test in App. A3.3 the 
partial sum of those terms gives about three correct decimals of the amplitude f(r). 

Since 


Am _ ckm = Cm/R = 2am, 
from (17) we thus obtain the solution (with r measured in feet and ft in seconds) 
t) = 1.108J9 (2.40487) cos 4.8097t — 0.140/9 (5.52017) cos 11.0402 + 0.045J9 (8.6537r) cos 17.3075t — -:: 


In Fig. 309, m = 1 gives an idea of the motion of the first term of our series, m = 2 of the second term, and 
m = 3 of the third term, so that we can “see” our result about as well as for a violin string in Sec. 12.3. iia 
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PROBLEM SET 12-10 


1-3} RADIAL SYMMETRY with arbitrary Ag and 
1. Why did we introduce polar coordinates in this I 7 
section? Ay = = | (0) cos né dé, 
71 2 
2. Radial symmetry reduces (5) to V?u = uy, + u,/r. si 
Derive this directly from V7u = Une + Uyy. Show B= I | f(0) sin nO dO. 
that the only solution of V?u = 0 depending only on " anR™ 1 i 


=i 


(e) Compatibility condition. Show that (9), Sec. 10.4, 
imposes on f(@) in (d) the “compatibility condition” 


r= Vx? 4 y? is u= alnr + b with arbitrary con- 
stants a and b. 
3. Alternative form of (5). Show that (5) can be written = 
Vu = (ruy),/r + teat, a form that is often practical. | (6) do = 0 
-7 


BOUNDARY VALUE PROBLEMS. SERIES a : 
(f) Neumann problem. Solve V“u = 0 in the annulus 


4, TEAM PROJECT. Series for Dirichlet and Neumann l<r<2if u,(1, 0) = sin 0, u,(2, 0) = 0. 
Problems 
(a) Show that u, = r”cos nd, u, = r™ sinn6,n = 0, 5-8| ELECTROSTATIC POTENTIAL. 
1,---, are solutions of Laplace’s equation Vu = 0 STEADY-STATE HEAT PROBLEMS 


with V7u given by (5). (What would u,, be in Cartesian 


: } ° The electrostatic potential satisfies Laplace’s equation 
coordinates? Experiment with small 7.) 


V7u = 0 in any region free of charges. Also the heat 
(b) Dirichlet problem (See Sec. 12.6) Assuming that equation u, = c?V2u (Sec. 12.5) reduces to Laplace’s 
termwise differentiation is permissible, show that a equation if the temperature u is time-independent 
solution of the Laplace equation in the disk r<R (“steady-state case”). Using (20), find the potential 
satisfying the boundary condition u(R, @) = f(@) (R and (equivalently: the steady-state temperature) in the disk 
f given) is r < 1if the boundary values are (sketch them, to see what 
is going on). 
. u(1, 0) = 220 if —dar < 6 < 377 and 0 otherwise 
. u(1, 0) = 400 cos? 6 
. ul, 0) = 110|6| if-7<0< 7 
(=) : .u(l,é)=¢6 if-57 <0<30 and 0 otherwise 

ar (2al| =) Sina) 

R . CAS EXPERIMENT. Equipotential Lines. Guess 

what the equipotential lines u(r, 8) = const in Probs. 5 


where dy, b, are the Fourier coefficients of f (see and 7 may look like. Then graph some of them, using 
Sec. 11.1). partial sums of the series. 


oo n 
u(r, 0) =agt+ > an(Z) cos n@ 
n=1 R 


(20) 


ne a ee | 


10. Semidisk. Find the electrostatic potential in the semi- 


(c) Dirichlet problem. Solve the Dirichlet problem 
disk r< 1, 0<60< 7 which equals 1100(7 — @) 


using (20) if R= 1 and the boundary values are 


u(0) = —100 volts if —7 < 6 <0, u(6) = 100 volts on the semicircle r= 1 and O on the segment 
if 0 < 6 < 7. (Sketch this disk, indicate the boundary -l<x<l. 
values.) 11. Semidisk. Find the steady-state temperature in a 


(d) Neumann problem. Show that the solution of the semicircular thin plate r<a,0<@<7 with the 


Neumann problem Veu = O0ifr<R uy (R, 0) = f(6) semicircle r = a kept at constant temperature ug and 
(where uy = du/0N is the directional derivative in the the segment —a < x <a at 0. 
direction of the outer normal) is 
CIRCULAR MEMBRANE 
utr, 8) = Ay + Sir"A, cosnb + By sin n6) 12. CAS PROJECT. Normal Modes. (a) Graph the 
n=l a - normal modes wa, U5, Ug as in Fig. 306. 
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15. 


16. 


17. 


18. 
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(b) Write a program for calculating the A,,’s in 
Example | and extend the table to m = 15. Verify 
numerically that a, ~ (m — 4)7 and compute the 
error for m = 1,---, 10. 


(c) Graph the initial deflection f(r) in Example 1 as 
well as the first three partial sums of the series. 
Comment on accuracy. 


(d) Compute the radii of the nodal lines of wg, ug, u4 
when R = 1. How do these values compare to those of 
the nodes of the vibrating string of length 1? Can you 
establish any empirical laws by experimentation with 
further u4,,? 


Frequency. What happens to the frequency of an 
eigenfunction of a drum if you double the tension? 


Size of a drum. A small drum should have a higher 
fundamental frequency than a large one, tension and 
density being the same. How does this follow from our 
formulas? 


Tension. Find a formula for the tension required 
to produce a desired fundamental frequency f; of a 
drum. 

Why is Ay + Ap + --- = 1 in Example 1? Compute 
the first few partial sums until you get 3-digit 
accuracy. What does this problem mean in the field 
of music? 

Nodal lines. Is it possible that for fixed c and R two 
or more uy, [see (16)] with different nodal lines 
correspond to the same eigenvalue? (Give a reason.) 
Nonzero initial velocity is more of theoretical interest 
because it is difficult to obtain experimentally. Show 
that for (17) to satisfy (9b) we must have 


R 
(21) ey = kn| rg (r)Jo(Q@mr/R) dr 
0 


where Ky, = 2/(camR)J{(Qm). 


VIBRATIONS OF A CIRCULAR MEMBRANE 
DEPENDING ON BOTH r AND 0 


19. 


(Separations) Show that substitution of u = F(r, 0)G(t) 
into the wave equation (6), that is, 

1 

ro, Uoe }> 


where A = ck, 


(22) 


1 
Utt (un t r uy t 
gives an ODE and a PDE 


(23) G+td2G=0, 


20 


21. 


22. 


1,-4t4F 
ro7. pe 8 


(24) Fass | k°F = 0. 


Show that the PDE can now be separated by sub- 
stituting F = W(r)Q(8), giving 


(25) Oo" +n?0 =0, 


(26) r2W" + rW! 4+ (kr? — n?)W = 0. 


Periodicity. Show that Q(6) must be periodic with 
period 277 and, therefore, n = 0, 1,2, --- in (25) and 
(26). Show that this yields the solutions Q,, = cos nd, 
O* = sin nd, W, = Jnkr),n = 0, 1,°-°. 


Boundary condition. Show that the boundary condition 


(27) u(R, 0, t) = 0 


leads tok = kin = Qmn/R, where s = aym is the mth 

positive zero of Jy,(s). 

Solutions depending on both r and @. Show that 

solutions of (22) satisfying (27) are (see Fig. 310) 
Unm = (Anm COS Cknmt + Bym sin ck ym) 


X In(Kknmr) cos nd 
(28) 


Us = (Axm COS CKymt + Bem sin ckymt) 


X Jn(Knmr) sin nd 


uy Uo) Uso 


Fig. 310. Nodal lines of some of the solutions (28) 


23. Initial condition. Show that u;,(r, 0,0) = 0 gives 


Bum = 9, Bim = 0 in (28). 


24. Show that v6, = 0 and wom is identical with (16) in 


this section. 


25. Semicircular membrane. Show that wu represents the 


fundamental mode of a semicircular membrane and 
find the corresponding frequency when c? = 1 and 
R=1. 
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12.11 Laplace’s Equation in Cylindrical and 
Spherical Coordinates. Potential 


One of the most important PDEs in physics and engineering applications is Laplace’s 
equation, given by 


(1) VG = by, + up, — 0 


Here, x, y, z are Cartesian coordinates in space (Fig. 167 in Sec. 9.1), Uay = a” u/ ax”, etc. 
The expression Vu is called the Laplacian of u. The theory of the solutions of (1) is 
called potential theory. Solutions of (1) that have continuous second partial derivatives 
are known as harmonic functions. 

Laplace’s equation occurs mainly in gravitation, electrostatics (see Theorem 3, Sec. 9.7), 
steady-state heat flow (Sec. 12.5), and fluid flow (to be discussed in Sec. 18.4). 

Recall from Sec. 9.7 that the gravitational potential u(x, y, z) at a point (x, y, z) resulting 
from a single mass located at a point (X, Y, Z) is 


(a 


Cc 
(2) u(x, y,2 == = 
7 Va -— xX? + (iy — + (eZ 


and u satisfies (1). Similarly, if mass is distributed in a region T in space with density 
p(X, Y, Z), its potential at a point (x, y, z) not occupied by mass is 


JJ 
(3) u(x, y,zZ) =k ; dX dY dZ. 


T 


(r > 0) 


It satisfies (1) because V2(1/n) = 0 (Sec. 9.7) and p is not a function of x, y, z. 

Practical problems involving Laplace’s equation are boundary value problems in a 
region T in space with boundary surface S. Such problems can be grouped into three types 
(see also Sec. 12.6 for the two-dimensional case): 


(I) First boundary value problem or Dirichlet problem if u is prescribed on S. 
(II) Second boundary value problem or Neumann problem if the normal 
derivative wu, = du/dn is prescribed on S. 
(II) Third or mixed boundary value problem or Robin problem if uv is prescribed 
on a portion of S and u,, on the remaining portion of S. 


In general, when we want to solve a boundary value problem, we have to first select 
the appropriate coordinates in which the boundary surface S has a simple representation. 
Here are some examples followed by some applications. 


Laplacian in Cylindrical Coordinates 


The first step in solving a boundary value problem is generally the introduction of 
coordinates in which the boundary surface S has a simple representation. Cylindrical 
symmetry (a cylinder as a region T) calls for cylindrical coordinates r, 0, z related to 
x, y, Z by 


(4) x =rcos@, y=rsin0, Z=2Z (Fig. 311). 
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I 
I 
I 
I 
I 
I 
I 
I 
N 


Fig. 311. Cylindrical coordinates Fig. 312. Spherical coordinates 
((20,0=6=2zn) ((20,050=27,0=9=7) 


For these we get Vu immediately by adding u,, to (5) in Sec. 12.10; thus, 


(5) a — 


Laplacian in Spherical Coordinates 


Spherical symmetry (a ball as region T bounded by a sphere S) requires spherical 
coordinates r, 0, ¢ related to x, y, z by 


(6) x = rcos@ sin ¢, y=rsin@ sin ¢, z=rcosqd (Fig. 312). 
Using the chain rule (as in Sec. 12.10), we obtain V7u in spherical coordinates 


eu . 2 Ou 1 ou i coth du Bs 1 a7u 
ror r* ad” r= ab sr? sin? 067" 


(7) Vu 


We leave the details as an exercise. It is sometimes practical to write (7) in the form 


2 
(7') wies|s (7%) 4 ee (sing 2) + aso | 
r° | or or sind dd ab sin” @ 00 


Remark on Notation. Equation (6) is used in calculus and extends the familiar notation 
for polar coordinates. Unfortunately, some books use 6 and ¢# interchanged, an extension 
of the notation x = rcos d, y = rsin@ for polar coordinates (used in some European 
countries). 


Boundary Value Problem in Spherical Coordinates 


We shall solve the following Dirichlet problem in spherical coordinates: 


sal a (2m) 1 (si *) = 
8) vee r2 Fe a ores ad ae : 


(9) u(R, pb) = fP) 


(10) lim u(r, $) = 0. 
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The PDE (8) follows from (7) or (7’) by assuming that the solution uw will not depend on 
0 because the Dirichlet condition (9) is independent of 0. This may be an electrostatic 
potential (or a temperature) f(@) at which the sphere S: r = R is kept. Condition (10) 
means that the potential at infinity will be zero. 


Separating Variables by substituting u(r, 6) = G(r)H(¢) into (8). Multiplying (8) by 
r?, making the substitution and then dividing by GH, we obtain 


1d ( 2 ic) 1 d ( i) 
r = sin d — }. 
G dr dr Hsing dd dob 


By the usual argument both sides must be equal to a constant k. Thus we get the two 
ODEs 


1 d A 
(11) La(2d8)_, oe Geo a 276 
G dr dr dr? dr 
and 
(12) a “(sino M2) + 14 = 0. 
sin db db dob 


The solutions of (11) will take a simple form if we set k = n(n + 1). Then, writing 
G' = dG/dr, etc., we obtain 


(13) r2G” + 2rG’ —n(n+ 1)G =0. 


This is an Euler—-Cauchy equation. From Sec. 2.5 we know that it has solutions G = r®. 
Substituting this and dropping the common factor r® gives 


a(a— 1) + 2a-—n(n+ 1) =0. The roots are a=n and -n-l, 
Hence solutions are 


1 


n+l" 
; 


(14) Gy(r) = r™ and Gir) = 


We now solve (12). Setting cos ¢ = w, we have sin? d=1- w2 and 


d_ddw__oig4 
dp dw dd dw’ 


Consequently, (12) with k = n(n + 1) takes the form 


(15) d C w?) | +nn+ 1H =0. 


This is Legendre’s equation (see Sec. 5.3), written out 
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H dH 
(15’) a= v9) 8 — aw ns nH =o. 
For integer n = 0, 1,--- the Legendre polynomials 
H = P,(w) = Py, (cos ) n= Q0,15%°%5 


are solutions of Legendre’s equation (15). We thus obtain the following two sequences 
of solution u = GH of Laplace’s equation (8), with constant A, and B,, where 
n=0,1,°°°, 


By, 
(16) (a) Un(r, 6) = Anr"P, (cos p), — (b)_ un(r, 6) = — 5 Pr (cos 6) 


jane 


Use of Fourier—Legendre Series 


Interior Problem: Potential Within the Sphere S._ We consider a series of terms from 
(16a), 


(17) u(r, b) = > Anr™P,(cos ) (r= R). 


n=0 


Since S is given by r = R, for (17) to satisfy the Dirichlet condition (9) on the sphere S, 
we must have 


(18) u(R, b) = >) AnR"P, (cos $) = fp); 


n=0 


that is, (18) must be the Fourier—Legendre series of {(@). From (7) in Sec. 5.8 we get 
the coefficients 


1 
2n + 1 ~ 
(19*) AnR” = | F(w) Pav) dw 
-1 
where f(w) denotes f(#) as a function of w = cos @. Since dw = —sin ¢ dd, and the limits 


of integration —1 and | correspond to @ = 7 and @ = 0, respectively, we also obtain 


antl 


19 Aa 
(19) re 


| se, (cos #) sin ¢ dd, n=0,1,-": 


If f(b) and f’ (b) are piecewise continuous on the interval 0 S ¢ S 7, then the series 
(17) with coefficients (19) solves our problem for points inside the sphere because it can 
be shown that under these continuity assumptions the series (17) with coefficients (19) 
gives the derivatives occurring in (8) by termwise differentiation, thus justifying our 
derivation. 
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EXAMPLE 1 


Exterior Problem: Potential Outside the Sphere S. Outside the sphere we cannot use 
the functions u,, in (16a) because they do not satisfy (10). But we can use the wi in (16b), 
which do satisfy (10) (but could not be used inside S; why?). Proceeding as before leads 
to the solution of the exterior problem 


--) 


(20) ur, o) = > 


n=0 


n 


af P, (cos @) (r2 R) 


n 


satisfying (8), (9), (10), with coefficients 


_ Fae I 


(21) Bn 5 


Rs | F()P,(cos ) sin d dé. 
0 


The next example illustrates all this for a sphere of radius | consisting of two hemispheres 
that are separated by a small strip of insulating material along the equator, so that these 
hemispheres can be kept at different potentials (110 V and 0 V). 


Spherical Capacitor 


Find the potential inside and outside a spherical capacitor consisting of two metallic hemispheres of radius 1 ft 
separated by a small slit for reasons of insulation, if the upper hemisphere is kept at 110 V and the lower is 
grounded (Fig. 313). 


Solution. The given boundary condition is (recall Fig. 312) 


0 if OS¢<7/2 


r= | 0 if w7/2<¢S8q. 


Since R = 1, we thus obtain from (19) 


2n + 1 a : 
Ay = i * 110 P,(cos @) sin d db 
0 


_ 2nt+1 


1 
- 110 | P,,(w) dw 

0 
where w = cos @. Hence P,,(cos ¢) sin 6 db = —P,(w) dw, we integrate from | to 0, and we finally get rid of 
the minus by integrating from 0 to 1. You can evaluate this integral by your CAS or continue by using (11) in 
Sec. 5.2, obtaining 


M 1 
(2n — 2m)! 
An = 55(2n + 1) 5) (-1)™ | w"2™ dy 
oa 2"m\(n — m)\(n — 2m)! Jo 


where M = n/2 for even n and M = (n — 1)/2 for odd n. The integral equals 1/(n — 2m + 1). Thus 


110 volts 


Fig. 313. Spherical capacitor in Example 1 
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EXAMPLE 2 
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55(2n+ 1) ¥ ye (2n — 2m)! 
” ) m\(n — m)\(n — 2m + I! 


(22) An 


m=0 
Taking n = 0, we get Ag = 55 (since 0! = 1). Forn = 1, 2,3, --+ we get 


165 2! 165 
2 = Of1!2! 2 


275 (4! 2! 
Ag 0, 
4 \0!2!3! NN! 


= ( 6! 4! ) 385 


Ay 


> 


3 . etc. 
8 \O0!3!4! 1!2!2! 8 
Hence the potential (17) inside the sphere is (since Py = 1) 
165 385 
(23) u(r, 6) = 55 + — r Py(cos d) — es r?P3(cos @) + ee: (Fig. 314) 


with P;, P3, «++ given by (11'), Sec. 5.21. Since R = 1, we see from (19) and (21) in this section that B, = An, 
and (20) thus gives the potential outside the sphere 

55 165 385 
(24) u(r, 6) = — + —~ Pi(cos hd) — —— Px(cos d) + +++. 

r 2r? 8r* 


Partial sums of these series can now be used for computing approximate values of the inner and outer potential. 


Also, it is interesting to see that far away from the sphere the potential is approximately that of a point charge, 
namely, 55/r. (Compare with Theorem 3 in Sec. 9.7.) 


110 


U IT t 


hla 


Fig. 314. Partial sums of the first 4, 6, and 11 nonzero terms of (23) for r = R = 1 


Simpler Cases. Help with Problems 


The technicalities encountered in cases that are similar to the one shown in Example | can often be avoided. 
For instance, find the potential inside the sphere S:r = R = | when S is kept at the potential f(@) = cos 2d. 
(Can you see the potential on S$? What is it at the North Pole? The equator? The South Pole?) 


Solution. w = cos ¢, cos 26 = 2 cos” d-1 2w2 — 1 $Po(w) 5 3 (aw? 3) 3. Hence the 
potential in the interior of the sphere is 


u= $r?P(w) = 3 = $r7Po(cos od) — 3 = 2723 cos” b =)= 2 |_| 


PROBLEM SET 12-71 


1. Spherical coordinates. Derive (7) from Vu in 3. Sketch P,(cos 0), 0 = 6 S 277, for n = 0,1, 2. (Use 
spherical coordinates. (11’) in Sec. 5.2.) 

2. Cylindrical coordinates. Verify (5) by transforming 4. Zero surfaces. Find the surfaces on which uy, ug, ug 
Vu back into Cartesian coordinates. in (16) are zero. 


SEC. 12.11 


5. CAS PROBLEM. Partial Sums. In Example 1 in the 
text verify the values of Ao, Aj, A2, Ag and compute 
Agq,***,A19. Try to find out graphically how well the 
corresponding partial sums of (23) approximate the 
given boundary function. 


6. CAS EXPERIMENT. Gibbs Phenomenon. Study the 
Gibbs phenomenon in Example 1 (Fig. 314) graphically. 


7. Verify that u,, and u*, in (16) are solutions of (8). 


8-15 | POTENTIALS DEPENDING ONLY ON r 


8. Dimension 3. Verify that the potential uw = c/r, r= 
Vx2 + y? + 2” satisfies Laplace’s equation in spherical 
coordinates. 


9. Spherical symmetry. Show that the only solution 
of Laplace’s equation depending only on r= 
Vx2 + y+ isu = c/r + k with constant c and k. 

10. Cylindrical symmetry. Show that the only solution of 
Laplace’s equation depending only on r = x2 + y? 
isu=clnr+k. 


11. Verification. Substituting u(r) with r as in Prob. 9 into 
Ure + Uyy + Uzz = 0, verify that uw” + 2u'/r = 0, in 
agreement with (7). 

12. Dirichlet problem. Find the electrostatic potential 
between coaxial cylinders of radii r; = 2 cm and 
rg = 4 cm kept at the potentials U; = 220 V and 
Uz = 140 V, respectively. 

13. Dirichlet problem. Find the electrostatic potential 
between two concentric spheres of radii r; = 2 cm 
and rg = 4 cm kept at the potentials U, = 220V and 
Uz = 140 V, respectively. Sketch and compare the 
equipotential lines in Probs. 12 and 13. Comment. 


14. Heat problem. If the surface of the ball r?= 
x2 + y? + 27 = R7 is kept at temperature zero and the 
initial temperature in the ball is f(r), show that the 
temperature u(r, ft) in the ball is a solution of u,z = 
C2(Upy + 2u,/r) satisfying the conditions u(R, ft) = 
0, u(r, 0) = f(r). Show that setting v =ru_ gives 
Ut = C7Urry v(R, t) = 0, u(r, 0) = rf(r). Include the 
condition v (0, t) = 0 (which holds because u must be 
bounded at r = 0), and solve the resulting problem by 
separating variables. 

15. What are the analogs of Probs. 12 and 13 in heat 
conduction? 


16-20 | BOUNDARY VALUE PROBLEMS 
IN SPHERICAL COORDINATES r, 6, d 

Find the potential in the interior of the sphere r = R = 1 
if the interior is free of charges and the potential on the 
sphere is 
16. f(¢) = cosh 17. f(¢) = 1 
18. f(¢) = 1 — cos? 19. f(d) = cos 26 
20. f(b) = 10 cos? b = 3 cos” & —S5cosd- 1 
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21. Point charge. Show that in Prob. 17 the potential exterior 
to the sphere is the same as that of a point charge at the 
origin. 

22. Exterior potential. Find the potentials exterior to the 
sphere in Probs. 16 and 19. 


23. Plane intersections. Sketch the intersections of the 
equipotential surfaces in Prob. 16 with xz-plane. 


24. TEAM PROJECT. Transmission Line and Related 
PDEs. Consider a long cable or telephone wire (Fig. 315) 
that is imperfectly insulated, so that leaks occur along the 
entire length of the cable. The source S of the current 
i(x, t) in the cable is at x = 0, the receiving end T at 
x = 1. The current flows from S to T and through the 
load, and returns to the ground. Let the constants R, L, 
C, and G denote the resistance, inductance, capacitance 
to ground, and conductance to ground, respectively, of 
the cable per unit length. 


Ss T 


| Load 


x=0 x=l 


Fig. 315. 


Transmission line 


(a) Show that (‘first transmission line equation’) 


_ _ pe 4 1 2! 


where u(x, f) is the potential in the cable. Hint: Apply 
Kirchhoff’s voltage law to a small portion of the cable 
between x and x + Ax (difference of the potentials at 
x and x + Ax = resistive drop + inductive drop). 
(b) Show that for the cable in (a) (“second transmis- 
sion line equation’), 

ue Gu+C - 

Ox ot 
Hint: Use Kirchhoff’s current law (difference of the 
currents at x and x + Ax = loss due to leakage to 
ground + capacitive loss). 


(c) Second-order PDEs. Show that elimination of i 
or u from the transmission line equations leads to 


Uy = LC, + (RC + GL)u, + RGu, 
inn = LCig, + (RC + GL)i, + RGi. 


(d) Telegraph equations. For a submarine cable, G 
is negligible and the frequencies are low. Show that 
this leads to the so-called submarine cable equations 
or telegraph equations 


Ugy = RCuz, Lee _ RCig. 
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Find the potential in a submarine cable with ends 
(x = 0, x = J) grounded and initial voltage distribution 
Up = const. 

(e) High-frequency line equations. Show that in the 
case of alternating currents of high frequencies the 
equations in (c) can be approximated by the so-called 
high-frequency line equations 


Uyye = LCuye, aye = LCigg. 


25. 


Solve the first of them, assuming that the initial 
potential is 


Up sin (77x/1), 
and uz(x, 0) = Oandu = Oatthe ends x = Oandx = / 
for all ¢. 


Reflection in a sphere. Let r,6,d be spherical 
coordinates. If u(r, 0, b) satisfies Vu = 0, show that 
v(r, 0, &) = u(1/r, 8, )/r satisfies V?v = 0. 


12.12 Solution of PDEs by Laplace Transforms 


Readers familiar with Chap. 6 may wonder whether Laplace transforms can also be used 
for solving partial differential equations. The answer is yes, particularly if one of the 
independent variables ranges over the positive axis. The steps to obtain a solution are 
similar to those in Chap. 6. For a PDE in two variables they are as follows. 


1. Take the Laplace transform with respect to one of the two variables, usually t. This 
gives an ODE for the transform of the unknown function. This is so since the 
derivatives of this function with respect to the other variable slip into the 
transformed equation. The latter also incorporates the given boundary and initial 


conditions. 


2. Solving that ODE, obtain the transform of the unknown function. 


3. Taking the inverse transform, obtain the solution of the given problem. 


If the coefficients of the given equation do not depend on f, the use of Laplace transforms 


will simplify the problem. 


We explain the method in terms of a typical example. 


EXAMPLE 1 Semi-Infinite String 


Find the displacement w (x, t) of an elastic string subject to the following conditions. (We write w since we need 


u to denote the unit step function.) 


(i) The string is initially at rest on the x-axis from x = 0 to © (“semi-infinite string”). 


(ii) For t > 0 the left end of the string (x = 0) is moved in a given fashion, namely, according to a single 


sine wave 


sint if0 StS27 


w0,) =f) = { (Fig. 316). 


(iii) Furthermore, Jim w(x, t) = 0 fort 2 0. 


f(t) 
1 


0 otherwise 


-l1 


Fig. 316. Motion of the left end of the string in Example 1 as a function of time t 
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Of course there is no infinite string, but our model describes a long string or rope (of negligible weight) with 
its right end fixed far out on the x-axis. 


Solution. We have to solve the wave equation (Sec. 12.2) 


a2. a2, 
ow ow T 
(1) = c2 > 2 = 


for positive x and t, subject to the “boundary conditions” 

(2) w(0, 1) = f(d, Jim w(x, t) = 0 (¢ = 0) 
with f as given above, and the initial conditions 

(3) (a) w(x, 0) = 0, (b) w(x, 0) = 0. 


We take the Laplace transform with respect to t. By (2) in Sec. 6.2, 


a" a 
se{ “ \ s*L{w} — sw(x, 0) — w;(x, 0) es ud \ : 
ot 


The expression —sw(x, 0) — w;(x, 0) drops out because of (3). On the right we assume that we may interchange 
integration and differentiation. Then 


a’w * eee a” [* a 
sf : \ = | en us dt | e “w(x, f) dt = — L£{w(a, D}. 
ax? 0 Ox 2 0 ax? 


Writing W(x, s) = L{w, 1}, we thus obtain 


aw ew ss? 


Since this equation contains only a derivative with respect to x, it may be regarded as an ordinary differential 
equation for W(x, s) considered as a function of x. A general solution is 


(4) Wx, s) = A(s)e%*/ + B(s)e*”°. 
From (2) we obtain, writing F(s) = L{f(O}, 
W(0, s) = L{w0,)} = LL fF} = FO). 
Assuming that we can interchange integration and taking the limit, we have 
iim W(x, s) = jim | ety (x, 1) dt = | ent jim w(x, t) dt = 0. 


0 0 


This implies A(s) = 0 in (4) because c > 0, so that for every fixed positive s the function e®*/° increases as x 


increases. Note that we may assume s > 0 since a Laplace transform generally exists for all s greater than some 
fixed k (Sec. 6.2). Hence we have 


W(0, s) = B(s) = F(s), 
so that (4) becomes 


WG, s) = Fie". 


From the second shifting theorem (Sec. 6.3) with a = x/c we obtain the inverse transform 


(5) wat) = 4 = *) u(: = *) (Fig. 317) 
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that is, 
. x in we x 
w(x, t) = sin Ce if o Sts ota or a> ze> (t— 2re 
and zero otherwise. This is a single sine wave traveling to the right with speed c. Note that a point x remains 
at rest until t = x/c, the time needed to reach that x if one starts at t = 0 (start of the motion of the left end) 


and travels with speed c. The result agrees with our physical intuition. Since we proceeded formally, we must 
verify that (5) satisfies the given conditions. We leave this to the student. ai] 


(t = 0) 


(t = 2n) 


(t = 47) 

x 
(t = 62) 

x 


Fig. 317. Traveling wave in Example 1 


We have reached the end of Chapter 12, in which we concentrated on the most important 
partial differential equations (PDEs) in physics and engineering. We have also reached 
the end of Part C on Fourier Analysis and PDEs. 


Outlook 


We have seen that PDEs underlie the modeling process of various important engineering 
application. Indeed, PDEs are the subject of many ongoing research projects. 

Numerics for PDEs follows in Secs. 21.4—21.7, which, by design for greater flexibility 
in teaching, are independent of the other sections in Part E on numerics. 

In the next part, that is, Part D on complex analysis, we turn to an area of a different 
nature that is also highly important to the engineer. The rich vein of examples and problems 
will signify this. It is of note that Part D includes another approach to the two-dimensional 
Laplace equation with applications, as shown in Chap. 18. 


PROBLEM SET 12-12 


1. Verify the solution in Example 1. What traveling wave dw dw 


adios tec =Oifx= 
do we obtain in Example 1 for a nonterminating ee ax 7 ot Mes WA) _— = 0, 
sinusoidal motion of the left end starting at t = 27r? w(0, t) = Oif 20 
2. Sketch a figure similar to Fig. 317 when c = 1 and ow Ow | _ - 
fx) is “triangular,” say, fx) = xifO<x<h,fe= Say 7 ap 77 VEIT wOD= | 


| = % if 3 <x < 1 and 0 otherwise. 


7. Solve Prob. 5 by separating variables. 


3. How does the speed of the wave in Example 1 of the 


i ing? a a a 
text depend on the tension and on the mass of the string? 8. “2 = 100 — + 100 m + 25, 
4-8| SOLVE BY LAPLACE TRANSFORMS . : 
w(x, 0) = Oifx 20, w(x, 0) = Oift 2 0, 
= gj ff 
4 eas, wi 0)= 1, w0,9=1 w(0, ft) = sintift2 0 
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9-12 | HEAT PROBLEM 


Find the temperature w(x,f) in a semi-infinite laterally 
insulated bar extending from x = 0 along the x-axis to 
infinity, assuming that the initial temperature is 0, w(x, t) > 0 
as x — © for every fixed t 2 0, and w(0, t) = f(t). Proceed 
as follows. 


9. Set up the model and show that the Laplace transform 
leads to 


(W = £{w}) 


and 
W= F(sje Y8V/° (F = L£{f}). 


10. Applying the convolution theorem, show that in Prob. 9, 


w(x, t) = 


x 
2¢V 1 


t 
[s0 - pyr 8 ee 81 4e ae 
0 


1. For what kinds of problems will modeling lead to an 
ODE? To a PDE? 

2. Mention some of the basic physical principles or laws 
that will give a PDE in modeling. 

3. State three or four of the most important PDEs and their 
main applications. 

4. What is “separating variables” in a PDE? When did we 
apply it twice in succession? 

5. What is d’Alembert’s solution method? To what PDE 
does it apply? 

6. What role did Fourier series play in this chapter? Fourier 
integrals? 

7. When and why did Legendre’s equation occur? Bessel’s 
equation? 

8. What are the eigenfunctions and their frequencies of the 
vibrating string? Of the vibrating membrane? 

9. What do you remember about types of PDEs? Normal 
forms? Why is this important? 

10. When did we use polar coordinates? Cylindrical coor- 
dinates? Spherical coordinates? 
11. Explain mathematically (not physically) why we got 

exponential functions in separating the heat equation, 
but not for the wave equation. 


12. Why and where did the error function occur? 
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11. Let w(0, t) = f(t) = u( (Sec. 6.3). Denote the corre- 
sponding w, W, and F by wo, Wo, and Fo. Show that 
then in Prob. 10, 


t 
| 73? et /c?r) dt 
2¢W 1 0 


= 1 -ert( e ) 
2cV1 


with the error function erf as defined in Problem Set 
12.7. 


12. Duhamel’s formula.* Show that in Prob. 11, 


Wo (x, t) = 


1 
Wo (x, 5) = me 


and the convolution theorem gives Duhamel’s formula 


t 


dWo 
W(x, t) = [10 — 7) —— dt. 
F OT 
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13. How do problems for the wave equation and the heat 
equation differ regarding additional conditions? 


14. Name and explain the three kinds of boundary conditions 
for Laplace’s equation. 


15. Explain how the Laplace transform applies to PDEs. 


16-18 Solve for u = u(x, y): 


16. Ug, + 25u = 0 
17. Uyy + uy — Ou = 18 
18. Uy + u(0, y) = f(y), 


ux (0, y) = gy) 


uy, = 0, 


19-21| NORMAL FORM 


Transform to normal form and solve: 
19. Uxy = Uyy 

20. Uxy + OUzy + QUyy = 0 

21. Uzy — 4uyy = 0 


22-24] VIBRATING STRING 


Find and sketch or graph (as in Fig. 288 in Sec. 12.3) the 
deflection u (x, t) of a vibrating string of length 77, extending 
from x =0 to x= 77, and c? = T/p = 4 starting with 
velocity zero and deflection: 
22. sin 4x 


24. dq — |x — dn 


23. sin? x 


4JEAN-MARIE CONSTANT DUHAMEL (1797-1872), French mathematician. 
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25-27 | HEAT 


Find the temperature distribution in a laterally insulated thin 
copper bar (c? = K/(op) = 1.158 cm?/ sec) of length 100 
cm and constant cross section with endpoints at x = 0 and 
100 kept at 0°C and initial temperature: 


25. sin 0.017x 26. 50 — |50 — x| 
27. sin? 0.01 7x 


28-30 | ADIABATIC CONDITIONS 


Find the temperature distribution in a laterally insulated 
bar of length a with c? = 1 for the adiabatic boundary 
condition (see Problem Set 12.6) and initial temperature: 


28. 3x? 29. 100 cos 2x 
30. 27 — 4|x — 37 

31-32 | TEMPERATURE IN A PLATE 
31. 


Let f(x, y) = u(x, y, 0) be the initial temperature in a 
thin square plate of side 77 with edges kept at 0°C and 
faces perfectly insulated. Separating variables, obtain 
from u, = c2V7u the solution 


°° 00 


u(xy,) = > > Bm sin mx sin ny ger tnt 
m=1n=1 
where 
4 T T 
Bmn = —> | | F(x, y) sin mx sin ny dx dy. 


SUMMARY-OF-CHAPTER-L2 
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32. Find the temperature in Prob. 31 if 
f(x,y) = x(a — x)y(ar — y). 


33-37 | MEMBRANES 


Show that the following membranes of area 1 with c? = 1 


have the frequencies of the fundamental mode as given 


(4-decimal values). Compare. 

33. Circle: a/(2V 7) = 0.6784 

34. Square: 1/\/2 = 0.7071 

35. Rectangle with sides 1:2:V5/8 = 0.7906 
36. Semicircle: 3.832/V87r = 0.7643 


37. Quadrant of circle: ag3/(4W 7) = 0.7244 
(a21 = 5.13562 = first positive zero of Jz) 


38-40 | ELECTROSTATIC POTENTIAL 


Find the potential in the following charge-free regions. 


38. Between two concentric spheres of radii rg and r; kept 


at potentials wo and wy, respectively. 
39, 


Compare with Prob. 38. 
40. 


usual spherical coordinates). 


Partial Differential Equations (PDEs) 


(1) wet = Cae 

(2) wee = Cure + Uy) 

(3) Up = Cae 

(4) Vu = Use + Uyy = 0 

(5) Vu = Uae + ttyy + Ue = 


Whereas ODEs (Chaps. 1-6) serve as models of problems involving only one 
independent variable, problems involving two or more independent variables (space 
variables or time f and one or several space variables) lead to PDEs. This accounts for 
the enormous importance of PDEs to the engineer and physicist. Most important are: 


One-dimensional wave equation (Secs. 12.2—12.4) 
Two-dimensional wave equation (Secs. 12.8—12.10) 
One-dimensional heat equation (Secs. 12.5, 12.6, 12.7) 
Two-dimensional Laplace equation (Secs. 12.6, 12.10) 


0 Three-dimensional Laplace equation 


(Sec. 12.11). 


Equations (1) and (2) are hyperbolic, (3) is parabolic, (4) and (5) are elliptic. 


Between two coaxial circular cylinders of radii rg and 
ry kept at the potentials uo and uy, respectively. 


In the interior of a sphere of radius 1 kept at the 
potential f(f) = cos3¢ + 3cos@ (referred to our 
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In practice, one is interested in obtaining the solution of such an equation in a 
given region satisfying given additional conditions, such as initial conditions 
(conditions at time t = 0) or boundary conditions (prescribed values of the solution 
u or some of its derivatives on the boundary surface S, or boundary curve C, of the 
region) or both. For (1) and (2) one prescribes two initial conditions (initial 
displacement and initial velocity). For (3) one prescribes the initial temperature 
distribution. For (4) and (5) one prescribes a boundary condition and calls the 
resulting problem a (see Sec. 12.6) 


Dirichlet problem if u is prescribed on S, 
Neumann problem if u,, = du/dn is prescribed on S, 
Mixed problem if u is prescribed on one part of S and u,, on the other. 


A general method for solving such problems is the method of separating 
variables or product method, in which one assumes solutions in the form of 
products of functions each depending on one variable only. Thus equation (1) is 
solved by setting u(x, t) = F(x)G(O; see Sec. 12.3; similarly for (3) (see Sec. 12.6). 
Substitution into the given equation yields ordinary differential equations for F and 
G, and from these one gets infinitely many solutions F = F,, and G = G,, such that 
the corresponding functions 


Un(X, t) = Frx)Gn(t) 


are solutions of the PDE satisfying the given boundary conditions. These are the 
eigenfunctions of the problem, and the corresponding eigenvalues determine the 
frequency of the vibration (or the rapidity of the decrease of temperature in the case 
of the heat equation, etc.). To satisfy also the initial condition (or conditions), one 
must consider infinite series of the u,,, whose coefficients turn out to be the Fourier 
coefficients of the functions f and g representing the given initial conditions (Secs. 
12.3, 12.6). Hence Fourier series (and Fourier integrals) are of basic importance 
here (Secs. 12.3, 12.6, 12.7, 12.9). 

Steady-state problems are problems in which the solution does not depend on 
time ¢. For these, the heat equation uz = c”Vu becomes the Laplace equation. 

Before solving an initial or boundary value problem, one often transforms the 
PDE into coordinates in which the boundary of the region considered is given by 
simple formulas. Thus in polar coordinates given by x = rcos 6, y = rsin 6, the 
Laplacian becomes (Sec. 12.11) 


1 1 
(6) V7 = Upp + — Uy + 3 Ue: 
r r 


for spherical coordinates see Sec. 12.10. If one now separates the variables, one gets 
Bessel’s equation from (2) and (6) (vibrating circular membrane, Sec. 12.10) and 
Legendre’s equation from (5) transformed into spherical coordinates (Sec. 12.11). 
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PART D 


~ | Complex 
a . 
™ Analysis 


Complex Numbers and Functions. Complex Differentiation 
Complex Integration 

Power Series, Taylor Series 

Laurent Series. Residue Integration 

Conformal Mapping 

Complex Analysis and Potential Theory 


Complex analysis has many applications in heat conduction, fluid flow, electrostatics, and 
in other areas. It extends the familiar “real calculus” to “complex calculus” by introducing 
complex numbers and functions. While many ideas carry over from calculus to complex 
analysis, there is a marked difference between the two. For example, analytic functions, 
which are the “good functions” (differentiable in some domain) of complex analysis, have 
derivatives of all orders. This is in contrast to calculus, where real-valued functions of 
real variables may have derivatives only up to a certain order. Thus, in certain ways, 
problems that are difficult to solve in real calculus may be much easier to solve in complex 
analysis. Complex analysis is important in applied mathematics for three main reasons: 

1. Two-dimensional potential problems can be modeled and solved by methods of 
analytic functions. This reason is the real and imaginary parts of analytic functions satisfy 
Laplace’s equation in two real variables. 

2. Many difficult integrals (real or complex) that appear in applications can be solved 
quite elegantly by complex integration. 

3. Most functions in engineering mathematics are analytic functions, and their study 
as functions of a complex variable leads to a deeper understanding of their properties and 
to interrelations in complex that have no analog in real calculus. 
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CHAPTER | 3 


Complex Numbers 
and Functions. Complex 
Differentiation 


The transition from “real calculus” to “complex calculus” starts with a discussion of 
complex numbers and their geometric representation in the complex plane. We then 
progress to analytic functions in Sec. 13.3. We desire functions to be analytic because 
these are the “useful functions” in the sense that they are differentiable in some domain 
and operations of complex analysis can be applied to them. The most important equations 
are therefore the Cauchy—Riemann equations in Sec. 13.4 because they allow a test of 
analyticity of such functions. Moreover, we show how the Cauchy—Riemann equations 
are related to the important Laplace equation. 

The remaining sections of the chapter are devoted to elementary complex functions 
(exponential, trigonometric, hyperbolic, and logarithmic functions). These generalize the 
familiar real functions of calculus. Detailed knowledge of them is an absolute necessity 
in practical work, just as that of their real counterparts is in calculus. 


Prerequisite: Elementary calculus. 
References and Answers to Problems: App. | Part D, App. 2. 


13.1 Complex Numbers and 
Their Geometric Representation 


The material in this section will most likely be familiar to the student and serve as a 
review. 

Equations without real solutions, such as x? = —-1 or x7 -— 10x + 40 = O, were 
observed early in history and led to the introduction of complex numbers.’ By definition, 
a complex number z is an ordered pair (x, y) of real numbers x and y, written 


Z=(,y). 


lFirst to use complex numbers for this purpose was the Italian mathematician GIROLAMO CARDANO 
(1501-1576), who found the formula for solving cubic equations. The term “complex number” was introduced 
by CARL FRIEDRICH GAUSS (see the footnote in Sec. 5.4), who also paved the way for a general use of 
complex numbers. 
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x is called the real part and y the imaginary part of z, written 
x = Rez, y = Imz. 
By definition, two complex numbers are equal if and only if their real parts are equal 
and their imaginary parts are equal. 


(0, 1) is called the imaginary unit and is denoted by i, 


(1) i = (0, 1). 


Addition, Multiplication. Notation z = x + iy 


Addition of two complex numbers z1 = (x1, y1) and zz = (X9, ya) is defined by 
(2) Z1 + 22 = (%1, 1) + (2; yo) = 1 + x2, yi + yo). 
Multiplication is defined by 

(3) Z1Z2 = (1, y1)(2, Yo) = 1X2 — yiye, Xiye + x21). 

These two definitions imply that 


(x4, 0) + (2, 0) = (41 + x2, 0) 
and 


(x1, 0)(X2, 0) = (x42, 0) 


as for real numbers x1, x2. Hence the complex numbers “extend” the real numbers. We 
can thus write 


(x, 0) = x. Similarly, (0, y) = iy 
because by (1), and the definition of multiplication, we have 
iy = (0, Dy = 0, D(y, 0) = (O-y— 1-0, 0-0+1-y) = (,y). 


Together we have, by addition, (x, y) = (x, 0) + (0, y) = x + iy. 
In practice, complex numbers z = (x, y) are written 


(4) To 
orz =x + yi, e.g., 17 + 47 (instead of i4). 
Electrical engineers often write j instead of i because they need i for the current. 


If x = 0, then z = iy and is called pure imaginary. Also, (1) and (3) give 


(5) i? =-1 


because, by the definition of multiplication, Pats (0, 1)(0, 1) = (-1, 0) = —1. 
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EXAMPLE 1 


EXAMPLE 2 


CHAP. 13. Complex Numbers and Functions. Complex Differentiation 


For addition the standard notation (4) gives [see (2)] 
(x1 + yy) + (XQ + ty) = (X1 + X2) + (1 + yo). 


For multiplication the standard notation gives the following very simple recipe. Multiply 
each term by each other term and use i 2 = —] when it occurs [see (3)]: 


. : . : 2 
(x1 + iyy)(%2q + iyg) = X1xXQ + IX yg t+ iyyXe + IY ye 


= (X4X2 — y1y2) + (X12 + X2y1). 


This agrees with (3). And it shows that x + iy is a more practical notation for complex 
numbers than (x, y). 

If you know vectors, you see that (2) is vector addition, whereas the multiplication (3) 
has no counterpart in the usual vector algebra. 


Real Part, Imaginary Part, Sum and Product of Complex Numbers 


Let z; = 8 + 3i and z2 = 9 — 2i. Then Re z; = 8, Imz, = 3, Re zp = 9, Imza 2 and 


zy +z =(8+3i/) + (9-2) =17 +3, 


zyze = (8 + 3i(9 — 2i) = 72 + 6 + i(—16 + 27) = 78 + Lili. | 


Subtraction, Division 


Subtraction and division are defined as the inverse operations of addition and multipli- 
cation, respectively. Thus the difference z = z1 — Zg is the complex number z for which 
Z1 =z + Ze. Hence by (2), 


(6) Ka — Hn = (Ch, = Sea) ae UD = Sa 


The quotient z = z1/z2 (z2 # 0) is the complex number z for which z, = zzg. If we 
equate the real and the imaginary parts on both sides of this equation, setting z = x + iy, 
we obtain x7 = xox — yay, yy = yox + xgy. The solution is 


24 : X4X2 + yiye X2y1 — X1y2 
(7) z= 7 xt iy, x= = — y= - - 
A x2 + y2 xo t+ y2 


The practical rule used to get this is by multiplying numerator and denominator of z1/z2 
by x2 — ive and simplifying: 


_ Xr t+ 1 1 + ty)G%2 — ye) X1%2 + Y1¥2 | | XaN1 — *1Y2 


ae a — 2p ge u are 
Xg t+ iyg (Xz + iye)(X2 — iy2) xo + yg x3 + yg 


(7) 


Difference and Quotient of Complex Numbers 


For z; = 8 + 3i and zg = 9 — 2i we get z1 — zo = (8 + 31) — (9 — 21) 1 + Si and 


Z1 8 +3) (8 +31/)9+ 21) 66+43i 66 43 
22. 9-21 (9 — 219 + 2i) 81 +4 85 85 


Check the division by multiplication to get 8 + 37. | 
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Complex numbers satisfy the same commutative, associative, and distributive laws as real 
numbers (see the problem set). 


Complex Plane 


So far we discussed the algebraic manipulation of complex numbers. Consider the 
geometric representation of complex numbers, which is of great practical importance. We 
choose two perpendicular coordinate axes, the horizontal x-axis, called the real axis, and 
the vertical y-axis, called the imaginary axis. On both axes we choose the same unit of 
length (Fig. 318). This is called a Cartesian coordinate system. 


(Imaginary y 
axis) 
y I 
| L 
PB . 2] x 
=x+ 
z=xtiy a ! 
1 | 
I I 
f (Real 
x axis) 3 4-31 
Fig. 318. The complex plane Fig. 319. The number 4 — 3/ in 


the complex plane 


We now plot a given complex number z = (x, y) = x + iy as the point P with coordinates 
x, y. The xy-plane in which the complex numbers are represented in this way is called the 
complex plane.” Figure 319 shows an example. 
Instead of saying “the point represented by z in the complex plane” we say briefly and 
simply “the point z in the complex plane.” This will cause no misunderstanding. 
Addition and subtraction can now be visualized as illustrated in Figs. 320 and 321. 


/ — 
g-- 
2, 


Fig. 320. Addition of complex numbers Fig. 321. Subtraction of complex numbers 


?Sometimes called the Argand diagram, after the French mathematician JEAN ROBERT ARGAND 
(1768-1822), born in Geneva and later librarian in Paris. His paper on the complex plane appeared in 1806, 
nine years after a similar memoir by the Norwegian mathematician CASPAR WESSEL (1745-1818), a surveyor 
of the Danish Academy of Science. 
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Complex Conjugate Numbers 
The complex conjugate z of a complex number z = x + iy is defined by 
La Keays 
It is obtained geometrically by reflecting the point z in the real axis. Figure 322 shows 


this for z = 5 + 2i and its conjugate z = 5 — 2i. 


z=x+ity=5+4 2i 


2 Z=n—ily=5—2i 


Fig. 322. Complex conjugate numbers 


The complex conjugate is important because it permits us to switch from complex 
to real. Indeed, by multiplication, zz = x7 + y? (verify!). By addition and subtraction, 
z+ z= 2x, z — z = 2iy. We thus obtain for the real part x and the imaginary part y 
(not iy!) of z = x + iy the important formulas 


1 
(8) Rez=x=3(z +2), a gO 


If z is real, z = x, then z = z by the definition of z, and conversely. Working with 
conjugates is easy, since we have 


(21 + 22) = % + Za, (21 — 22) = Z1 — Za, 
(9) — Z4 24 
= 7129, —)==—. 
(Z1Z2) 162 Zo Zo 


EXAMPLE 3 Illustration of (8) and (9) 


Let z} = 4 + 3i and zg = 2 + Si. Then by (8), 


Im z, ‘14 + 31) — (4 — 3i)] 
2i 


Also, the multiplication formula in (9) is verified by 


(Z1Z2) = (4 + 31(2 + 5i) = (—7 + 261) 7 — 260i, 


Z1Z2 = (4 — 3i)(2 — Si) 7 — 26i. Bo 
PR-OBEEM—SET 43-4 
1. Powers of i. Show that i2 = —1,i° = —i,i* = 1, this by graphing z and iz and the angle of rotation for 
i> = i,--- and 1/i = -i, 1/i? =-1, ii? =i--. z=1ltiz=—-1+21,7=4- 37. 


2. Rotation. Multiplication by i is geometrically a 3. Division. Verify the calculation in (7). Apply (7) to 
counterclockwise rotation through 7/2 (90°). Verify (26 — 18i)/(6 — 2i). 
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4. Law for conjugates. Verify (9) for z} = —11 + 10i, 
Zg=—-14+ 4. 

5. Pure imaginary number. Show that z = x + iy is 
pure imaginary if and only if z = —z. 

6. Multiplication. If the product of two complex numbers 
is zero, show that at least one factor must be zero. 

7. Laws of addition and multiplication. Derive the 
following laws for complex numbers from the cor- 
responding laws for real numbers. 


21 + Zq = 2a + 24,2122 = 2221 (Commutative laws) 


(zi 4 


Zi + (22 + Z3), 
(Associative laws) 
(Z1Z2)23 = Z1(Z2Z3) 


Z2) + 23 = 


Z1(Z2 + Z3) = 2122 + 2123 (Distributive law) 


0O+z=z+0=z 


z+(-2=C2a+2=90, Zs 
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8-15 | COMPLEX ARITHMETIC 
Let z3 = —2 + Ili, zz = 2 — i. Showing the details of 
your work, find, in the form x + iy: 

8. z1Z2,  (Z1Z2) 9. Re (z3), (Rez 4)" 


10. Re (1/z3), 


1/Re (z3) 


11. (zy — 22)7/16, (23/4 — 22/4)? 


12. z1/Z2, 


Zo/Z1 


13. (z1 + zo)(z1 — Za), Zi — 22 


14, %1/Za, 


(z1/Z2) 


15. 4 (z1 + z2)/(z1 — Z2) 


16-20 


of x and y: 


16. Im (1/2), Im (1/z?) 


18. Re [(1 


Let z = x + iy. Showing details, find, in terms 


17. Re z* — (Re z?)? 


+ iy'®27] 19. Re(z/Z, Im (z/Z) 


20. Im (1/2?) 


13.2 Polar Form of Complex Numbers. 


Powers and Roots 


We gain further insight into the arithmetic operations of complex numbers if, in addition 
to the xy-coordinates in the complex plane, we also employ the usual polar coordinates 


r, @ defined by 


(1) 


x = rcos 0, 


y=rsiné. 


We see that then z = x + iy takes the so-called polar form 


(2) 


z= r(cos@ + isin @). 


r is called the absolute value or modulus of z and is denoted by |z|. Hence 


(3) 


Geometrically, 


lz] =r= Vx2 + y2 = Vz. 


z| is the distance of the point z from the origin (Fig. 323). Similarly, 


|z1 — Za| is the distance between z, and zg (Fig. 324). 
@ is called the argument of z and is denoted by arg z. Thus 6 = arg z and (Fig. 323) 


(4) 


tan 0 zis 
EG 


(z # 0). 


Geometrically, 0 is the directed angle from the positive x-axis to OP in Fig. 323. Here, as 
in calculus, all angles are measured in radians and positive in the counterclockwise sense. 
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EXAMPLE 1 


Fig. 325. 


m/4 


Example 1 


CHAP. 13. Complex Numbers and Functions. Complex Differentiation 


For z = 0 this angle 6 is undefined. (Why?) For a given z # 0 it is determined only up 
to integer multiples of 277 since cosine and sine are periodic with period 277. But one 
often wants to specify a unique value of arg z of a given z # 0. For this reason one defines 
the principal value Arg z (with capital A!) of arg z by the double inequality 


(5) —7 <ArgezSq7. 


Then we have Arg z = 0 for positive real z = x, which is practical, and Arg z = 7 (not 
—7r!) for negative real z, e.g., for z = —4. The principal value (5) will be important in 
connection with roots, the complex logarithm (Sec. 13.7), and certain integrals. Obviously, 
for a given z # 0, the other values of arg z are arg z = Argz + 2n7 (n = +1, £2,::-). 


Imaginary 
axis 


P ; 
z=xtiy 
Real 
x axis 
Fig. 323. Complex plane, polar form Fig. 324. Distance between two 
of a complex number points in the complex plane 


Polar Form of Complex Numbers. Principal Value Arg z 


z= 1+ i (Fig. 325) has the polar form z = V2 (cos har P Psi ha). Hence we obtain 
lz| = V2, arg 7 = har + 2n7 (n = 0, 1,---), and Arg z= har (the principal value). 


Similarly, z = 3 + 3V3i = 6 (cos 47 + isin dar), |z| = 6, and Arg z = 4a. | 


CAUTION! In using (4), we must pay attention to the quadrant in which z lies, since 
tan 6 has period 77, so that the arguments of z and —z have the same tangent. Example: 
for 0, = arg (1 + i) and 62 = arg (—1 — i) we have tan 6; = tan 02 = 1. 


Triangle Inequality 


Inequalities such as x; < xg make sense for real numbers, but not in complex because there 
is no natural way of ordering complex numbers. However, inequalities between absolute values 
(which are real!), such as lz4| < lzol (meaning that z, is closer to the origin than zg) are of 
great importance. The daily bread of the complex analyst is the triangle inequality 


(6) Fe ae | = Faall 2 ka] (Fig. 326) 


which we shall use quite frequently. This inequality follows by noting that the three 
points 0, z1, and z; + zg are the vertices of a triangle (Fig. 326) with sides lzi|, zal, and 
|z1 + zg, and one side cannot exceed the sum of the other two sides. A formal proof is 
left to the reader (Prob. 33). (The triangle degenerates if z1 and zg lie on the same straight 
line through the origin.) 
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EXAMPLE 2 


Fig. 326. Triangle inequality 
By induction we obtain from (6) the generalized triangle inequality 
(6*) a tee = ley te lag) eee za 
that is, the absolute value of a sum cannot exceed the sum of the absolute values of the terms. 


Triangle Inequality 


If zy = 1 + iand zo = —2 + 31, then (sketch a figure!) 
ley + zal = |-1 + 4i| = V17 = 4.123 < V2 + V13 = 5.020. | 


Multiplication and Division in Polar Form 


This will give us a “geometrical” understanding of multiplication and division. Let 
Z1 = r,(cos 0, + isin 64) and Za = re(cos Og + isin A). 
Multiplication. By (3) in Sec. 13.1 the product is at first 
Z1Z2 = r1rgl(cos 61 cos 65 — sin 6; sin 09) + i(sin 64 cos A + cos 6, sin O9)]. 


The addition rules for the sine and cosine [(6) in App. A3.1] now yield 
(7) Z1Z2 = ryre[cos(6, + 65) + isin(0, + Oo). 


Taking absolute values on both sides of (7), we see that the absolute value of a product 
equals the product of the absolute values of the factors, 


(8) Iz1zal aa Izallzal. 


Taking arguments in (7) shows that the argument of a product equals the sum of the 
arguments of the factors, 


(9) arg (Z1Z2) = arg Zz; + arg Zo (up to multiples of 277). 


Division. We have z1 = (z1/Z2)z2. Hence |z1| = |(z1/z2)zel = |z1/zallze| and by 
division by \zo| 


* Izal 


Gil 
Izol 


(10) (zo # 0). 


£2 
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EXAMPLE 3 


EXAMPLE 4 


CHAP. 13. Complex Numbers and Functions. Complex Differentiation 
Similarly, arg z1 = arg [(z1/z2)za] = arg (z1/z2) + arg zg and by subtraction of arg zo 


& 
(11) arg = = arg Z1 — arg Zo (up to multiples of 277). 


Combining (10) and (11) we also have the analog of (7), 


£1 


al rae 
(12) = = s [cos (0,3 — 09) + isin (81 — 6@9)]. 


To comprehend this formula, note that it is the polar form of a complex number of absolute 
value r1/rse and argument 6; — 69. But these are the absolute value and argument of z1/ze, 
as we can see from (10), (11), and the polar forms of z1 and Zo. 

Illustration of Formulas (8)-(11) 


Let z1 = —2 + 2i and zg = 3i. Then z1zp = —6 — 6i, z1/z2 = 2 + yi. Hence (make a sketch) 


lzyzal = 6V2 = 3V8 = Izallze 


» — lz1/zel = 2V2/3 = lzil/Ize 


> 


and for the arguments we obtain Arg z1 = 3727/4, Arg zo = 77/2, 


307 Z1 7 
Arg (Z1Z2) Arg z1 + Arg zz — 277, Arg = Arg z1 — Arg Ze. a] 
Integer Powers of z. De Moivre’s Formula 
From (8) and (9) with z1 = zg = z we obtain by induction for n = 0, 1, 2,--- 
(13) z” =r” (cosn6 + isin nd). 


Similarly, (12) with z; = 1 and zy = 2” gives (13) forn = —1, —2,---. For |z| = r = 1, formula (13) becomes 
De Moivre’s formula* 


(13*) (cos 9 + isin 6)" = cos nO + isinné. 


We can use this to express cos n@ and sin né in terms of powers of cos @ and sin 6. For instance, for n = 2 we 
have on the left cos? @ + 2icos @ sin @ — sin? 6. Taking the real and imaginary parts on both sides of (13*) 
with n = 2 gives the familiar formulas 


cos 20 = cos” @ — sin” 6, sin 20 = 2 cos @ sin 0. 


This shows that complex methods often simplify the derivation of real formulas. Try n = 3. @ 


Roots 


If z= w" (n = 1, 2,---), then to each value of w there corresponds one value of z. We 
shall immediately see that, conversely, to a given z # 0 there correspond precisely n 
distinct values of w. Each of these values is called an nth root of z, and we write 


3 ABRAHAM DE MOIVRE (1667-1754), French mathematician, who pioneered the use of complex numbers 
in trigonometry and also contributed to probability theory (see Sec. 24.8). 
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(14) w= Vz. 


Hence this symbol is multivalued, namely, n-valued. The n values of Wz can be obtained 
as follows. We write z and w in polar form 


z = r(cos 6 + isin 8) and w = R(cos @ + isin @). 
Then the equation w” = z becomes, by De Moivre’s formula (with ¢ instead of 6), 
w” = R"(cosnd + isinnd) = z = r(cos 6 + isin 6). 
The absolute values on both sides must be equal; thus, R” = r, so that R = Vr, where 
V ris positive real (an absolute value must be nonnegative!) and thus uniquely determined. 


Equating the arguments n@ and @ and recalling that 6 is determined only up to integer 
multiples of 277, we obtain 


2k 
nd = 6 + 2km, thus ie oe 
n n 
where k is an integer. Fork = 0, 1,---, — 1 we get n distinct values of w. Further integers 


of k would give values already obtained. For instance, k = n gives 2k7r/n = 277, hence 
the w corresponding to k = 0, etc. Consequently, Wz, for z # 0, has the n distinct values 


a + 
(15) Wz = (cos u ae + isin ae 


where k = 0, 1,::-:,n — 1. These n values lie on a circle of radius Vr with center at the 
origin and constitute the vertices of a regular polygon of 1 sides. The value of Wz obtained 
by taking the principal value of arg z and k = 0 in (15) is called the principal value of 
w= Vz. 

Taking z = 1 in (15), we have |z| = r = | and Arg z = 0. Then (15) gives 


(16) WI = cos + isin =, k=0,1-+:,n-1. 


These n values are called the nth roots of unity. They lie on the circle of radius 1 and 
center 0, briefly called the unit circle (and used quite frequently!). Figures 327-329 show 


WI = 1,-5 + 50/31, WI = +1, +i, andy/1. 


Fig.327. W/1 Fig.328. W/1 Fig.329. W/1 
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If w denotes the value corresponding to k = 1 in (16), then the 7 values of W 1 can be 


written as 


More generally, if w; is any nth root of an arbitrary complex number z (# 0), then the n 


values of Wz in (15) are 


(17) W1, 


W110, 


because multiplying w; by wo” corresponds to increasing the argument of w; by 2k7r/n. 
Formula (17) motivates the introduction of roots of unity and shows their usefulness. 


1-8| POLAR FORM 

Represent in polar form and graph in the complex plane as 
in Fig. 325. Do these problems very carefully because polar 
forms will be needed frequently. Show the details. 


1 a ee 
1a 4, —5 
V2 + i/3 V3 — 10i 
ae 6. — 
A/$ = 3/3 Baa + Si 
vee => 193 
ie 2+ 5i 
9-14] PRINCIPAL ARGUMENT 


Determine the principal value of the argument and graph it 
as in Fig. 325. 


9 -1+i 10. —5, -5—-i, -5+i 
11.3241 12. -—7 -7i 

13. (1 + i)? 14. -14+0.1i, -1-0.1i 
15-18| CONVERSION TO x + iy 


Graph in the complex plane and represent in the form x + iy: 
15. 3 (cos 37T — isin 377) 16. 6 (cos 37 + isin dir) 
17. V8 (cos a7 + isin 47) 


18. V/50 (cos 3a + i sin 327) 


ROOTS 


19. CAS PROJECT. Roots of Unity and Their Graphs. 
Write a program for calculating these roots and for 
graphing them as points on the unit circle. Apply the 
program to z” = | withn = 2, 3,---, 10. Then extend 
the program to one for arbitrary roots, using an idea 
near the end of the text, and apply the program to 
examples of your choice. 


PROBLEM SET 13-2 


20. TEAM PROJECT. Square Root. (a) Show that 
w = Vz has the values 


W4= Vr | co 5+ isin} 


W2 = vefeos(5 t r) t isin( 2 t r)| 


= Wi. 


(18) 


(b) Obtain from (18) the often more practical formula 
(19) Vz= +[V4czl +x) + (sign yi Vz] + 2] 


where sign y = 1 if y = 0, sign y = —1 if y < 0, and 
all square roots of positive numbers are taken with 
positive sign. Hint: Use (10) in App. A3.1 withx = 6/2. 
(c) Find the square roots of —147, —9 — 40i, and 
1+ V48i by both (18) and (19) and comment on the 
work involved. 


(d) Do some further examples of your own and apply 
a method of checking your results. 


ROOTS 


Find and graph all roots in the complex plane. 


21. W1i+i 22. W344i 
23. W/216 = 24. W—4 


25. Wi 26. V1 27. W-1 
28-31] EQUATIONS 

Solve and graph the solutions. Show details. 
28. 22 — (6 — 2i)z +17 - 61 =0 

29. 27 +z24+1-7=0 


30. <4 + 324 = 0. Using the solutions, factor z* + 324 
into quadratic factors with real coefficients. 


31. c* — 6iz2 + 16 =0 
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32-35 | INEQUALITIES AND EQUALITY 34. Re and Im. Prove [Re z| = |zl, [Im z| S |zl. 
32. Triangle inequality. Verify (6) for z1=3+ 7%, 35, Parallelogram equality. Prove and explain the name 
z2 > —2+ 4i ‘ ; ‘ : 
33. Triangle inequality. Prove (6). Iza + zel” + [za — zal” = 2 (lzal* + Izel. 


13.3 Derivative. Analytic Function 


Just as the study of calculus or real analysis required concepts such as domain, 
neighborhood, function, limit, continuity, derivative, etc., so does the study of complex 
analysis. Since the functions live in the complex plane, the concepts are slightly more 
difficult or different from those in real analysis. This section can be seen as a reference 
section where many of the concepts needed for the rest of Part D are introduced. 


Circles and Disks. Half-Planes 


The unit circle |z| = 1 (Fig. 330) has already occurred in Sec. 13.2. Figure 331 shows a 
general circle of radius p and center a. Its equation is 


Iz—al=p 
y 
y es 
y ai i 
yo Po x 
t a \ 
. i / e. , 
ie O | 1 1 
\ q a / | 
\ = ] 
L. oes d 
. ul 
x ee x 
Fig. 330. Unit circle Fig. 331. Circle in the Fig. 332. Annulus in the 
complex plane complex plane 


because it is the set of all z whose distance |z — al from the center a equals p. Accordingly, 
its interior (“open circular disk”) is given by |z — a| < p, its interior plus the circle 
itself (“closed circular disk”) by |z — a| S p, and its exterior by |z — a| > p. As an 
example, sketch this for a = 1 + i and p = 2, to make sure that you understand these 
inequalities. 

An open circular disk |z — a| < pis also called a neighborhood of a or, more precisely, 
a p-neighborhood of a. And a has infinitely many of them, one for each value of p (> 0), 
and a is a point of each of them, by definition! 

In modern literature any set containing a p-neighborhood of a is also called a neigh- 
borhood of a. 

Figure 332 shows an open annulus (circular ring) py < |z — a| < p, which we shall 
need later. This is the set of all z whose distance |z — a] from a is greater than p, but 
less than pg. Similarly, the closed annulus p; = |z — a| S pg includes the two circles. 


Half-Planes. By the (open) upper half-plane we mean the set of all points z = x + iy 
such that y > 0. Similarly, the condition y < 0 defines the lower half-plane, x > 0 the 
right half-plane, and x < 0 the left half-plane. 
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For Reference: Concepts on Sets 
in the Complex Plane 


To our discussion of special sets let us add some general concepts related to sets that we 
shall need throughout Chaps. 13-18; keep in mind that you can find them here. 

By a point set in the complex plane we mean any sort of collection of finitely many 
or infinitely many points. Examples are the solutions of a quadratic equation, the 
points of a line, the points in the interior of a circle as well as the sets discussed just 
before. 

A set S is called open if every point of S' has a neighborhood consisting entirely of 
points that belong to S. For example, the points in the interior of a circle or a square form 
an open set, and so do the points of the right half-plane Re z = x > 0. 

A set S is called connected if any two of its points can be joined by a chain of finitely 
many straight-line segments all of whose points belong to S$. An open and connected set 
is called a domain. Thus an open disk and an open annulus are domains. An open square 
with a diagonal removed is not a domain since this set is not connected. (Why?) 

The complement of a set S in the complex plane is the set of all points of the complex 
plane that do not belong to S. A set S is called closed if its complement is open. For example, 
the points on and inside the unit circle form a closed set (“closed unit disk’’) since its 
complement |z| > 1 is open. 

A boundary point of a set S is a point every neighborhood of which contains both points 
that belong to S and points that do not belong to S. For example, the boundary points of 
an annulus are the points on the two bounding circles. Clearly, if a set S is open, then no 
boundary point belongs to S; if S is closed, then every boundary point belongs to S$. The 
set of all boundary points of a set S is called the boundary of S. 

A region is a set consisting of a domain plus, perhaps, some or all of its boundary points. 
WARNING! “Domain” is the modern term for an open connected set. Nevertheless, some 
authors still call a domain a “region” and others make no distinction between the two terms. 


Complex Function 


Complex analysis is concerned with complex functions that are differentiable in some 
domain. Hence we should first say what we mean by a complex function and then define 
the concepts of limit and derivative in complex. This discussion will be similar to that in 
calculus. Nevertheless it needs great attention because it will show interesting basic 
differences between real and complex calculus. 

Recall from calculus that a real function f defined on a set S of real numbers (usually an 
interval) is a rule that assigns to every x in S a real number f(x), called the value of f at x. 
Now in complex, S is a set of complex numbers. And a function f defined on S is a rule 
that assigns to every z in S a complex number w, called the value of f at z. We write 


w = f(z). 


Here z varies in S$ and is called a complex variable. The set S is called the domain of 
definition of f or, briefly, the domain of f. (In most cases S will be open and connected, 
thus a domain as defined just before.) 

Example: w = f(z) = 2? + 3zisa complex function defined for all z; that is, its domain 
S is the whole complex plane. 

The set of all values of a function fis called the range of f. 
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EXAMPLE 1 


EXAMPLE 2 


w is complex, and we write w = u + iv, where uw and v are the real and imaginary 
parts, respectively. Now w depends on z = x + iy. Hence u becomes a real function of x 
and y, and so does v. We may thus write 


w = f@) = ux, y) + it, y). 


This shows that a complex function f(z) is equivalent to a pair of real functions u(x, y) 
and u(x, y), each depending on the two real variables x and y. 


Function of a Complex Variable 


Let w = f(z) = 22 + 3z. Find w and v and calculate the value of f at z = 1 + 3i. 


Solution. u = Re f(z) x? y? + 3x and v = 2xy + 3y. Also, 


fd + 3i/) = 4 3° + 311 + 3) = 1-94 61 +34 9i SD. e TSt, 


This shows that w(1, 3) = —5 and v(1, 3) = 15. Check this by using the expressions for u and v. ‘| 


Function of a Complex Variable 
Let w = f(z) = 2iz + 6z, Find uw and v and the value of f at z = 3 + 4i. 


Solution. f(z) = 2i(x + iy) + 6(x — iy) gives u(x, y) = 6x — 2y and v(x, y) = 2x — 6y. Also, 


f@ + 4i) = 215 + 4) + 646 — 4) =i - 8 +3 - 241 5 — 231. 


Check this as in Example 1. | 


Remarks on Notation and Terminology 


1. Strictly speaking, f(z) denotes the value of f at z, but it is a convenient abuse of 
language to talk about the function f(z) (instead of the function f), thereby exhibiting the 
notation for the independent variable. 


2. We assume all functions to be single-valued relations, as usual: to each z in S there 
corresponds but one value w = f(z) (but, of course, several z may give the same value 
w = f(z), just as in calculus). Accordingly, we shall not use the term “multivalued 
function” (used in some books on complex analysis) for a multivalued relation, in which 
to a z there corresponds more than one w. 


Limit, Continuity 
A function f(z) is said to have the limit / as z approaches a point zp, written 
(1) lim f@ = 1, 


if f is defined in a neighborhood of zo (except perhaps at Zp itself) and if the values of 
fare “close” to / for all z “close” to zo; in precise terms, if for every positive real € we can 
find a positive real 6 such that for all z # Zo in the disk lz — zl <6 (Fig. 333) we have 


(2) If) — il <e; 
geometrically, if for every z # Zo in that 6-disk the value of f lies in the disk (2). 


Formally, this definition is similar to that in calculus, but there is a big difference. 
Whereas in the real case, x can approach an xX, only along the real line, here, by definition, 
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EXAMPLE 3 
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z may approach Zo from any direction in the complex plane. This will be quite essential 
in what follows. 
If a limit exists, it is unique. (See Team Project 24.) 


A function f(z) is said to be continuous at z = Zo if f(zo) is defined and 
3) lim f(@) = feo). 
Note that by definition of a limit this implies that f(z) is defined in some neighborhood 


of Z0- 
f(z) is said to be continuous in a domain if it is continuous at each point of this domain. 


nee c x 
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Fig. 333. Limit 


Derivative 


The derivative of a complex function f at a point zo is written f'(zo) and is defined by 


fo + AZ) — fo) 
Az 


4) f'@o) = lt, 


provided this limit exists. Then f is said to be differentiable at zo. If we write Az = z — Zo, 
we have z = zg + Az and (4) takes the form 


f(z) — fo) 


Kh a A 


4) fo) = lim 


Now comes an important point. Remember that, by the definition of limit, f(z) is defined 
in a neighborhood of zo and z in (4’) may approach zg from any direction in the complex 
plane. Hence differentiability at z) means that, along whatever path z approaches Zo, the 
quotient in (4’) always approaches a certain value and all these values are equal. This is 
important and should be kept in mind. 


Differentiability. Derivative 
The function f(z) = 2? is differentiable for all z and has the derivative f'(2 = 2z because 
(c+ Ay?-2 +22 Az + (Ay? - 2 


t : 1 4 = > 
f@ = jim, ie um, Ae Jim, @z + Az) = 2z. | 
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EXAMPLE 4 


DEFINITION 


The differentiation rules are the same as in real calculus, since their proofs are literally 
the same. Thus for any differentiable functions f and g and constant c we have 


’ ’ ’ ’ ’ , , , TY Fi — fg" 
(ef) >of) Fe) =f te, Us) —fe rie, (3) 7 


as well as the chain rule and the power rule (2")' = nz (n integer). 
Also, if f(z) is differentiable at zg, it is continuous at z9. (See Team Project 24.) 


z not Differentiable 


It may come as a surprise that there are many complex functions that do not have a derivative at any point. For 
instance, f(z) = Z = x — iy is such a function. To see this, we write Az = Ax + iAy and obtain 


f(c@t+Azd-f@ («+Az—-z Az Ax—idy 


(5) ees 
Az Az Az Ax+tiAy 
If Ay = 0, this is +1. If Ax = 0, this is —1. Thus (5) approaches +1 along path I in Fig. 334 but —1 along 
path II. Hence, by definition, the limit of (5) as Az — 0 does not exist at any z. ia 
y 


Fig. 334. Paths in (5) 


Surprising as Example 4 may be, it merely illustrates that differentiability of a complex 
function is a rather severe requirement. 

The idea of proof (approach of z from different directions) is basic and will be used 
again as the crucial argument in the next section. 


Analytic Functions 


Complex analysis is concerned with the theory and application of “analytic functions,” 
that is, functions that are differentiable in some domain, so that we can do “calculus in 
complex.” The definition is as follows. 


Analyticity 


A function f(z) is said to be analytic in a domain D if f(z) is defined and differentiable 
at all points of D. The function f(z) is said to be analytic at a point z = Zo in D if 
F(Z) 1s analytic in a neighborhood of Zo. 

Also, by an analytic function we mean a function that is analytic in some domain. 


Hence analyticity of f(z) at z9 means that f(z) has a derivative at every point in some 
neighborhood of zo (including zo itself since, by definition, zg is a point of all its 
neighborhoods). This concept is motivated by the fact that it is of no practical interest 
if a function is differentiable merely at a single point zg but not throughout some 
neighborhood of zo. Team Project 24 gives an example. 

A more modern term for analytic in D is holomorphic in D. 
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EXAMPLE 5_ Polynomials, Rational Functions 


The nonnegative integer powers 1, z, aa 


that is, functions of the form 


f@) 


where Co, ***, Cy are complex constants. 


are analytic in the entire complex plane, and so are polynomials, 


The quotient of two polynomials g(z) and h(z), 


is called a rational function. This fis analytic except at the points where h(z) = 0; here we assume that common 


factors of g and / have been canceled. 


Many further analytic functions will be considered in the next sections and chapters. | 


The concepts discussed in this section extend familiar concepts of calculus. Most 
important is the concept of an analytic function, the exclusive concern of complex 
analysis. Although many simple functions are not analytic, the large variety of remaining 
functions will yield a most beautiful branch of mathematics that is very useful in 


engineering and physics. 


1-8 | REGIONS OF PRACTICAL INTEREST 
Determine and sketch or graph the sets in the complex plane 
given by 

1. |z+1-5i| =3 
~0< {zg)<1 
.am<|z-442i| <37 
—7 <Imz<7 


i larg z| < 47 

. Re (l/z) < 1 

Rez2=-1 

«lee a| = les 

. WRITING PROJECT. Sets in the Complex Plane. 
Write a report by formulating the corresponding 
portions of the text in your own words and illustrating 
them with examples of your own. 


COMPLEX FUNCTIONS AND THEIR DERIVATIVES 


err avan bk wn 


10-12 | Function Values. Find Re f, and Im f and their 
values at the given point z. 

10. f(z) = 522 — 12g +3 + 2iat4 — 3i 

11. f(2) = 1/ —zatl—-i 

12. f(z) = (z — 2)/(z + 2) at 8i 

13. CAS PROJECT. Graphing Functions. Find and graph 


Re f, Im f, and || as surfaces over the z-plane. Also 
graph the two families of curves Re f(z) = const and 


PROBLEM SET 143-3 


Im f(z) = const in the same figure, and the curves 
| f(z)| = const in another figure, where (a) f(z) = 2, 


(b) f(@) = 1/z, © f@ = z*. 


14-17 | Continuity. Find out, and give reason, whether 
f(z) is continuous at z = Oif f(0) = O and for z # O the 
function f is equal to: 
14. (Re z”)/|z| 

16. (Im z”)/|z|? 


15. |z|? Im (1/2) 
17. (Re 2)/(1 — Izl) 


18-23 
of 

18. (c — i/(< + iati 19. (z — 41% at = 3 + 4i 
20. (1.5z + 2i)/(3iz — 4) at any z. Explain the result. 

21. i(1 — z)" at 0 

22. (iz? + 32?)3 at 2i 23. 2/(z + iP ati 

24. TEAM PROJECT. Limit, Continuity, Derivative 
(a) Limit. Prove that (1) is equivalent to the pair of 


relations 


jim Re f(z) = Rel, 
Zo 


Differentiation. Find the value of the derivative 


jim Im f(z) = Im/. 

Zo 

(b) Limit. If lim f(x) exists, show that this limit is 
unique. a 

(c) Continuity. If z1, z2,--- are complex numbers for 
which lim 2, = a,and if f(z) is continuous at z = a, 
show that lim f(@n) = f(a). 
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(d) Continuity. If f(z) is differentiable at zo, show that 25. WRITING PROJECT. Comparison with Calculus. 


f(z) is continuous at Zo. Summarize the second part of this section beginning with 
(e) Differentiability. Show that f(z) = Re z = xis not Complex Function, and indicate what is conceptually 
differentiable at any z. Can you find other such functions? analogous to calculus and what is not. 


(f) Differentiability. Show that f(z) = |z|? is dif- 
ferentiable only at z = 0; hence it is nowhere analytic. 


13.4 Cauchy—Riemann Equations. 
Laplace’s Equation 


THEOREM —-1 


As we saw in the last section, to do complex analysis (i.e., “calculus in the complex”) on 
any complex function, we require that function to be analytic on some domain that is 
differentiable in that domain. 

The Cauchy-Riemann equations are the most important equations in this chapter 
and one of the pillars on which complex analysis rests. They provide a criterion (a test) 
for the analyticity of a complex function 


w = f(z) = ux, y) + iv, y). 


Roughly, fis analytic in a domain D if and only if the first partial derivatives of u and v 
satisfy the two Cauchy-Riemann equations* 


(1) Uy = Vy, Uy = —Vy 


everywhere in D; here uz = du/dx and uy = du/dy (and similarly for v) are the usual 
notations for partial derivatives. The precise formulation of this statement is given in 
Theorems 1 and 2. 


Example: f(z) = z? = x? — y® + 2ixy is analytic for all z (see Example 3 in Sec. 13.3), 
and u =x” — y” and v = 2xy satisfy (1), namely, uz = 2x = vy as well as uy = 
—2y = —vy. More examples will follow. 


Cauchy—Riemann Equations 


Let f(z) = u(x, y) + iv(x, y) be defined and continuous in some neighborhood of a 
point z = x + iy and differentiable at z itself. Then, at that point, the first-order 
partial derivatives of u and v exist and satisfy the Cauchy—Riemann equations (1). 

Hence, if f(z) is analytic in a domain D, those partial derivatives exist and satisfy 
(1) at all points of D. 


4The French mathematician AUGUSTIN-LOUIS CAUCHY (see Sec. 2.5) and the German mathematicians 
BERNHARD RIEMANN (1826-1866) and KARL WEIERSTRASS (1815-1897; see also Sec. 15.5) are the 
founders of complex analysis. Riemann received his Ph.D. (in 1851) under Gauss (Sec. 5.4) at Géttingen, where 
he also taught until he died, when he was only 39 years old. He introduced the concept of the integral as it is 
used in basic calculus courses, and made important contributions to differential equations, number theory, and 
mathematical physics. He also developed the so-called Riemannian geometry, which is the mathematical 
foundation of Einstein’s theory of relativity; see Ref. [GenRef9] in App. 1. 
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PROOF 


CHAP. 13. Complex Numbers and Functions. Complex Differentiation 


By assumption, the derivative f'(z) at z exists. It is given by 


+ Az) - 
(2) f@ = lim f@ z) FQ 


Az>0 Az 


The idea of the proof is very simple. By the definition of a limit in complex (Sec. 13.3), 
we can let Az approach zero along any path in a neighborhood of z. Thus we may choose 
the two paths I and II in Fig. 335 and equate the results. By comparing the real parts we 
shall obtain the first Cauchy—Riemann equation and by comparing the imaginary parts the 
second. The technical details are as follows. 

We write Az = Ax + i Ay. Then z + Az =x + Ax + i(y + Ay), and in terms of u 
and v the derivative in (2) becomes 


; [wu + Ax, y + Ay) + v(x + Ax, y + Ay)] — [u@, y) + iv, y)] 
(3) f= Jim eT 
ZS x + iAy 


We first choose path I in Fig. 335. Thus we let Ay > 0 first and then Ax +0. After Ay 
is zero, Az = Ax. Then (3) becomes, if we first write the two u-terms and then the two 
v-terms, 


P — ux + Ax,y) — ux, y) v(x + Ax, y) — v(x, y) 
f(@ = lim + i lim : 
Ax—>0 Ax Ax—0 Ax 


Fig. 335. Paths in (2) 


Since f'(z) exists, the two real limits on the right exist. By definition, they are the partial 
derivatives of u and v with respect to x. Hence the derivative f (2) of f(z) can be written 


(4) f (2) = ty + ws, 


Similarly, if we choose path II in Fig. 335, we let Ax — 0 first and then Ay — 0. After 
Ax is zero, Az = i Ay, so that from (3) we now obtain 


u(x, y + Ay) — uG, v(x,y + Ay) — v(x, 


Since f(z) exists, the limits on the right exist and give the partial derivatives of u and v 
with respect to y; noting that 1/i = —i, we thus obtain 


(5) f@ = —iuy + vy. 


The existence of the derivative f(z) thus implies the existence of the four partial derivatives 
in (4) and (5). By equating the real parts uz, and v, in (4) and (5) we obtain the first 
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EXAMPLE 1 


THEOREM 2 


EXAMPLE 2 


EXAMPLE 3 


Cauchy—Riemann equation (1). Equating the imaginary parts gives the other. This proves 
the first statement of the theorem and implies the second because of the definition of 
analyticity. a 


Formulas (4) and (5) are also quite practical for calculating derivatives f (z), as we shall see. 


Cauchy—Riemann Equations 


f@= z° is analytic for all z. It follows that the Cauchy—Riemann equations must be satisfied (as we have 
verified above). 


For f(z) = Z = x — iy we have u = x, v = —yand see that the second Cauchy—Riemann equation is satisfied, 
Uy = —vxz = 0, but the first is not: uv, = 1 # vy = —1. We conclude that f(z) = Z is not analytic, confirming 
Example 4 of Sec. 13.3. Note the savings in calculation! B 


The Cauchy—Riemann equations are fundamental because they are not only necessary but 
also sufficient for a function to be analytic. More precisely, the following theorem holds. 


Cauchy—Riemann Equations 


If two real-valued continuous functions u(x, y) and U(x, y) of two real variables x 
and y have continuous first partial derivatives that satisfy the Cauchy—Riemann 
equations in some domain D, then the complex function f(z) = u(x, y) + iv(x, y) is 
analytic in D. 


The proof is more involved than that of Theorem | and we leave it optional (see App. 4). 
Theorems | and 2 are of great practical importance, since, by using the Cauchy—Riemann 
equations, we can now easily find out whether or not a given complex function is analytic. 


Cauchy—Riemann Equations. Exponential Function 


Is f(z) = u(x, y) + iv(x, y) = e*(cos y + isin y) analytic? 


Solution. We have u = e“cos y, v = e” sin y and by differentiation 
Uy = e cosy, Vy = e* cosy 
= ed — we gt 
Uy = —e" siny, Vy, =e" siny. 


We see that the Cauchy—Riemann equations are satisfied and conclude that f(z) is analytic for all z. (f(z) will 
be the complex analog of e” known from calculus.) 


An Analytic Function of Constant Absolute Value Is Constant 


The Cauchy—Riemann equations also help in deriving general properties of analytic functions. 
For instance, show that if f(z) is analytic in a domain D and |f(z)| = k = const in D, then f(z) = const in 
D. (We shall make crucial use of this in Sec. 18.6 in the proof of Theorem 3.) 


Solution. By assumption, |f|? = |u + iv|? = u? + v? = k. By differentiation, 


Uuy, + VVy = 0, 


Ululy + UVy = 0. 
Now use v, = —Uuy in the first equation and v, = uz in the second, to get 


(a) Ul — UUy = 0, 
(6) 
(b)  uuy — vug = 0. 
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THEOREM 3 


PROOF 
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To get rid of uy, multiply (6a) by u and (6b) by v and add. Similarly, to eliminate u,,, multiply (6a) by —v and 
(6b) by w and add. This yields 


(u2 + v)u, = 0, 


(u? + v)uy = 0. 


If ke? =u2? +0? = 0, then u = v = 0; hence f = 0. If k? = yw? + v2 0, then uz, = uy = 0. Hence, by the 
Cauchy—Riemann equations, also uz = vy = 0. Together this implies wu = const and v = const; hence 


f = const. a 


We mention that, if we use the polar form z = r(cos 0 + isin @) and set f(z) = u(r, 0) + 
iv(r, 0), then the Cauchy—Riemann equations are (Prob. 1) 


1 
Up = — Up, (r > 0). 
(7) ie i 
Ur = 7; “6 


Laplace’s Equation. Harmonic Functions 


The great importance of complex analysis in engineering mathematics results mainly from 
the fact that both the real part and the imaginary part of an analytic function satisfy Laplace’ s 
equation, the most important PDE of physics. It occurs in gravitation, electrostatics, fluid 
flow, heat conduction, and other applications (see Chaps. 12 and 18). 


Laplace’s Equation 


Tf f(z) = u(x, y) + wx, y) is analytic in a domain D, then both u and v satisfy 
Laplace’s equation 


(8) V7u = Une + Uyy = 0 
(V2 read “nabla squared’’) and 
(9) Vv = Ung + Vyy = 0, 


in D and have continuous second partial derivatives in D. 


Differentiating u, = v, with respect to x and uy = —v, with respect to y, we have 


(10) Ugy = Vygs Uyy = —Uzgy. 


Now the derivative of an analytic function is itself analytic, as we shall prove later (in 
Sec. 14.4). This implies that u and v have continuous partial derivatives of all orders; in 
particular, the mixed second derivatives are equal: Vy, = Uyy. By adding (10) we thus 
obtain (8). Similarly, (9) is obtained by differentiating u,, = v, with respect to y and 


Uy = —U,, with respect to x and subtracting, using Ugy = Uy. (| 


y 
Solutions of Laplace’s equation having continuous second-order partial derivatives are called 
harmonic functions and their theory is called potential theory (see also Sec. 12.11). Hence 
the real and imaginary parts of an analytic function are harmonic functions. 
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EXAMPLE 4 


If two harmonic functions u and vu satisfy the Cauchy—Riemann equations in a domain 
D, they are the real and imaginary parts of an analytic function fin D. Then vu is said to 
be a harmonic conjugate function of u in D. (Of course, this has absolutely nothing to 
do with the use of “conjugate” for Z.) 


How to Find a Harmonic Conjugate Function by the Cauchy—Riemann Equations 


Verify that u = x2 — y* — y is harmonic in the whole complex plane and find a harmonic conjugate function 
v of u. 


Solution. V?u = 0 by direct calculation. Now uy, = 2x and Uy = —2y — 1. Hence because of the Cauchy— 
Riemann equations a conjugate uv of u must satisfy 


Vy = Uy = 2x, Uy = —Uuy = 2y t+ 1. 
Integrating the first equation with respect to y and differentiating the result with respect to x, we obtain 


dh 
v = 2xy + A(x), Vz = 2y+—. 
dx 


A comparison with the second equation shows that dh/dx = 1. This gives h(x) = x + c. Hence v = 2xy +x +c 
(c any real constant) is the most general harmonic conjugate of the given u. The corresponding analytic function is 


f(z) =u + iv x? y? y + i(QQxy +x 4+ c) 2 t+ iz t ic. i] 


Example 4 illustrates that a conjugate of a given harmonic function is uniquely determined 
up to an arbitrary real additive constant. 

The Cauchy—Riemann equations are the most important equations in this chapter. Their 
relation to Laplace’s equation opens a wide range of engineering and physical applications, 
as shown in Chap. 18. 


PROBLEM SET 13-4 


1. Cauchy—Riemann equations in polar form. Derive (7) 14. v = xy 15. u= x/ (x? + y?) 
from (1). 16. u = sinxcoshy 17. v = (2x + Dy 

2-11] CAUCHY-RIEMANN EQUATIONS 18. uw = x° — 3xy" 
Are the following functions analytic? Use (1) or (7). 19. v = e* sin 2y 

2. f(z) = izz 20. Laplace’s equation. Give the details of the derivative 
3. f(z) = e 2" (cos 2y — isin 2y) ats 

4. f(z) = e” (cos y — isin y) Determine a and D so that the given function is 
5. f(z) = Re (22) — i Im (2?) harmonic and find a harmonic conjugate. 

6. f(z) = 1/@ - 2) 7. f(2) = i/z8 21. u = e”” cos av 

8. f(z) = Arg 277z 22. u = cos ax cosh 2y 

9. f(z) = 307 /(23 + 42772) 23. u = ax® + bxy 

10. f(z) = In |z| + i Argz 24. u = cosh ax cos y 

11. f(z) = cos x cosh y — isin x sinh y 25. CAS PROJECT. Equipotential Lines. Write a 

program for graphing equipotential lines uw = const of 

HARMONIC FUNCTIONS a harmonic function u and of its conjugate v on the 
Are the following functions harmonic? If your answer same axes. Apply ane program to (a) u = = y, 


is yes, find a corresponding analytic function f(z) = Vv = 2xy, (b) u = x 


u(x, y) + iv(x, y). 
12,.u=x2% + y? 


3xy”, v= 3x?y y°. 
26. Apply the program in Prob. 25 to u =e” cosy, 
13. u = xy v = e” sin y and to an example of your own. 
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27. Harmonic conjugate. Show that if u is harmonic and 30. TEAM PROJECT. Conditions for f(z) = const. Let 


v is a harmonic conjugate of u, then u is a harmonic f(z) be analytic. Prove that each of the following 
conjugate of —v. conditions is sufficient for f(z) = const 
28. Illustrate Prob. 27 by an example. (a) Re f(z) = const 
29. Two further formulas for the derivative. Formulas (4), (b) Im f(z) = const 
(5), and (11) (below) are needed from time to time. Derive (c) f'() =0 
Ql) f@=u,- itty, f= Vy + Wy. (d) |f(2| = const (see Example 3) 


13.5 Exponential Function 


In the remaining sections of this chapter we discuss the basic elementary complex 
functions, the exponential function, trigonometric functions, logarithm, and so on. They 
will be counterparts to the familiar functions of calculus, to which they reduce when z = x 
is real. They are indispensable throughout applications, and some of them have interesting 
properties not shared by their real counterparts. 

We begin with one of the most important analytic functions, the complex exponential 
function 


e’" also written eXp Z. 
The definition of e* in terms of the real functions e”, cos y, and sin y is 


(1) e* = e*(cosy + isin y). 


This definition is motivated by the fact the e* extends the real exponential function e* of 
calculus in a natural fashion. Namely: 


(A) e* = e” for real z = x because cos y = | and sin y = 0 when y = 0. 
(B) e* is analytic for all z. (Proved in Example 2 of Sec. 13.4.) 


(C) The derivative of e* is e*, that is, 
(2) ey. =<. 
This follows from (4) in Sec. 13.4, 
(e*)' = (e” cos y)x + i(e” sin y), = e* cos y + ie” siny = e”. 
REMARK. This definition provides for a relatively simple discussion. We could define e* 
by the familiar series 1 + x + x /2! fe x3/3! + +++ with x replaced by z, but we would 
then have to discuss complex series at this very early stage. (We will show the connection 


in Sec. 15.4.) 


Further Properties. A function f(z) that is analytic for all z is called an entire function. 
Thus, e* is entire. Just as in calculus the functional relation 


(3) ets = prigi2 
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holds for any zy = x1 + iy, and zg = Xg + iyo. Indeed, by (1), 
ee = e“\(cos yy + isin y,)e"2(cos yo + isin yo). 


Since e“e*2 = e”!*” for these real functions, by an application of the addition formulas 
for the cosine and sine functions (similar to that in Sec. 13.2) we see that 


v1 + Car Zy4+Z2 


ete? = € cos (yy + yg) + isin (yy + ye)] = e€ 
as asserted. An interesting special case of (3) is z1 = x, Zz = iy; then 
(4) e* = eve”, 
Furthermore, for z = ty we have from (1) the so-called Euler formula 
(5) e” = cosy + isiny. 
Hence the polar form of a complex number, z = r(cos 6 + i sin #), may now be written 
(6) z= re’. 
From (5) we obtain 
(7) ae 
as well as the important formulas (verify!) 
(8) en =i em = -1, gO UF ap eT = —], 
Another consequence of (5) is 


(9) le”| = |cos y + isiny| = Vos? y + sin? y = 1. 


That is, for pure imaginary exponents, the exponential function has absolute value 1, a 
result you should remember. From (9) and (1), 


(10) le*| = e. Hence arge* = y+2nm (n=0,1,2,-°-), 


since |e*| = e” shows that (1) is actually e* in polar form. 
From |e*| = e” # 0 in (10) we see that 


(11) e #0 for all z. 


So here we have an entire function that never vanishes, in contrast to (nonconstant) 
polynomials, which are also entire (Example 5 in Sec. 13.3) but always have a zero, as 
is proved in algebra. 
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Periodicity of e* with period 277i, 
(12) pura e for all z 


is a basic property that follows from (1) and the periodicity of cos y and sin y. Hence all 
the values that w = e* can assume are already assumed in the horizontal strip of width 27 


(13) = KPSoar (Fig. 336). 


This infinite strip is called a fundamental region of e*. 


Function Values. Solution of Equations 


Computation of values from (1) provides no problem. For instance, 


et 4-0-6 — 61-4(cos 0.6 — i sin 0.6) = 4.055(0.8253 — 0.56461) = 3.347 — 2.289; 


[et 4-1-6] = ol4 = 4.055, Arg et 406i = _(6, 


To illustrate (3), take the product of 


ets e°(cos 1 + isin 1) and et e*(cos 1 — isin 1) 


and verify that it equals e7e*(cos? 1 + sin? 1) = e& = grr one). 
To solve the equation e* = 3 + 4i, note first that \e*| = e” = 5,x = In5 = 1.609 is the real part of all 
solutions. Now, since e” = 5, 


e* cos y = 3, e’ siny = 4, cos y = 0.6, siny = 0.8, y = 0.927. 


Ans. z = 1.609 + 0.927i + 2n7ri (n = 0, 1, 2,--+). These are infinitely many solutions (due to the periodicity 
of e*). They lie on the vertical line x = 1.609 at a distance 277 from their neighbors. fa 


To summarize: many properties of e* = exp z parallel those of e”; an exception is the 
periodicity of e* with 27ri, which suggested the concept of a fundamental region. Keep 
in mind that e* is an entire function. (Do you still remember what that means?) 


a 


Fig. 336. Fundamental region of the 
exponential function e” in the z-plane 


PROBLEEM—SET 43-5 


1. e7 is entire. Prove this. 8-13 Polar Form. Write in exponential form (6): 
2-7)| Function Values. Find e* in the form u + iv Be WE Ph eH 
and |e*| if z equals 10. Vi, Vi 11. —6.3 
2.34 4i 3. 2mi(1 + i) 12. 1/(1 — g) 13. 1 +i 
4. 0.6 — 1.87 5.24 377i 14-17| Real and Imaginary Parts. Find Re and Im of 
6. 117i/2 7. V2 4 aTi 14. e7™* 15. exp (2?) 
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16. e'/” 17. exp (z?) 

18. TEAM PROJECT. Further Properties of the Ex- 
ponential Function. (a) Analyticity. Show that e* is 
entire. What about e!/*? e? e"(cos ky + i sin ky)? (Use 
the Cauchy—Riemann equations.) 

(b) Special values. Find all z such that (i) e* is real, 
(ii) |e~*| < 1, (iii) e* = e*. 

(c) Harmonic function. Show that u = e™” cos 
(x2/2 — y?/2) is harmonic and find a conjugate. 


(d) Uniqueness. It is interesting that f(z) = e* is 
uniquely determined by the two properties f(x + i0) = 
e” and f'@ = f(z), where f is assumed to be entire. 
Prove this using the Cauchy—Riemann equations. 


19-22 | Equations. Find all solutions and graph some 
of them in the complex plane. 

19. e* =1 20. & =4+4 3: 
21. e =0 22." =-=2 


13.6. Trigonometric and Hyperbolic Functions. 


Euler’s Formula 


Just as we extended the real e* to the complex e* in Sec. 13.5, we now want to extend 
the familiar real trigonometric functions to complex trigonometric functions. We can do 
this by the use of the Euler formulas (Sec. 13.5) 


e” = cosx + isinx, e ” =cosx — isinx. 


By addition and subtraction we obtain for the real cosine and sine 


1. - 
sinx = —(e” — e”), 


cos x = 3(e + e~%), - 
i 


This suggests the following definitions for complex values z = x + iy: 


lee: 
(1) cos z = $(e* + e~%), sing = Fe (e* — e”). 
i 


It is quite remarkable that here in complex, functions come together that are unrelated in 
real. This is not an isolated incident but is typical of the general situation and shows the 
advantage of working in complex. 

Furthermore, as in calculus we define 


5 sin z COS Z 
tan z = cotz = —= 
(2) : COS Z’ a sin Z 
and 
1 1 
go a 
(3) sec Z = Coga> esc z = Gao: 


Since e* is entire, cos z and sin z are entire functions. tan z and sec z are not entire; they 
are analytic except at the points where cos z is zero; and cot z and csc z are analytic except 
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where sin zis zero. Formulas for the derivatives follow readily from (e*)’ = e* and (1)-(3): 
as in calculus, 


(4) (cos z)’ = —sin z, (sin z)’ = cos z, (tan z)' = sec” ge: 
etc. Equation (1) also shows that Euler’s formula is valid in complex: 
(5) e* =cosz + isinz for all z. 


The real and imaginary parts of cos z and sin z are needed in computing values, and they 
also help in displaying properties of our functions. We illustrate this with a typical example. 


Real and Imaginary Parts. Absolute Value. Periodicity 


Show that 
(a) cos z = cos x cosh y — isin x sinh y 
me (b) sin z = sinx cosh y + icos x sinh y 
and 
(a) — |cos z|? = cos? x + sinh? y 
(7) 


(b) lsin z|? = sin? x + sinh? y 
and give some applications of these formulas. 


Solution. From (1), 


cose = deh fe ett) 


- de—Y(cos x + isin x) + de4(cos x — isin x) 
= 3(e¥ +e”) cos x — Si(eY — e~) sin x. 
This yields (6a) since, as is known from calculus, 
(8) cosh y = 3(eY +e), sinh y = de —e%); 
(6b) is obtained similarly. From (6a) and cosh” y=1+ sinh? y we obtain 
|cos z|? = (cos” x) + sinh? y) + sin? x sinh? y. 


Since sin? x + cos? x = 1, this gives (7a), and (7b) is obtained similarly. 

For instance, cos (2 + 3i) = cos 2 cosh 3 — isin 2 sinh 3 = —4.190 — 9.109i. 

From (6) we see that sin z and cos z are periodic with period 27, just as in real. Periodicity of tan z and cot z 
with period 77 now follows. 

Formula (7) points to an essential difference between the real and the complex cosine and sine; whereas 
|cos x| S 1 and |sin x| S 1, the complex cosine and sine functions are no longer bounded but approach infinity 
in absolute value as y > %, since then sinh y > © in (7). B 


Solutions of Equations. Zeros of cos z and sin z 


Solve (a) cos z = 5 (which has no real solution!), (b) cos z = 0, (c) sin z = 0. 
Solution. (a) e?"* — 10e + 1 = 0 from (1) by multiplication by e”. This is a quadratic equation in e, 
with solutions (rounded off to 3 decimals) 


e* =e Yt™ — 5 + V/25 —1 = 9.899 and 0.101. 


Thus e~¥ = 9.899 or 0.101, e” = 1, y = 2.292, x = 2n7. Ans. z = +2n7 + 2.292i (n = 0, 1, 2,-°:). 
Can you obtain this from (6a)? 
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(b) cos x = 0, sinh y = 0 by (7a), y = 0. Ans. z = +4 Qn + l)ar (n = 0, 1, 2,°--). 
(c) sinx = 0, sinh y = 0 by (7b), Ans. z = £n7 (n = 0, 1, 2,°--). 
Hence the only zeros of cos z and sin z are those of the real cosine and sine functions. 2] 


General formulas for the real trigonometric functions continue to hold for complex 
values. This follows immediately from the definitions. We mention in particular the 
addition rules 


(0) cos (Z1 + Za) = COS Z1 COS Zy + SIN Z1 SiN Zo 


sin (z1 + Za) = sin z1COS Za + Sin Zg COS Z4 
and the formula 
(10) cos? z + sin?z = 1. 
Some further useful formulas are included in the problem set. 


Hyperbolic Functions 


The complex hyperbolic cosine and sine are defined by the formulas 
(11) cosh z = 5(e* + e%), sinh z = $(e* — e7*). 


This is suggested by the familiar definitions for a real variable [see (8)]. These functions 
are entire, with derivatives 


(12) (cosh z)’ = sinh z, (sinh z)’ = cosh z, 


as in calculus. The other hyperbolic functions are defined by 


sinh z cosh z 
tanh z = , coth z = — : 
cosh z sinh z 
(13) 
sech l csch ? 
cosh Zz «sinh Zz 


Complex Trigonometric and Hyperbolic Functions Are Related. If in (11), we replace z 
by iz and then use (1), we obtain 


(14) cosh iz = cos z, sinh iz = isin z. 
Similarly, if in (1) we replace z by iz and then use (11), we obtain conversely 
(15) cos iz = cosh z, sin iz = i sinh z. 


Here we have another case of unrelated real functions that have related complex analogs, 
pointing again to the advantage of working in complex in order to get both a more unified 
formalism and a deeper understanding of special functions. This is one of the main reasons 
for the importance of complex analysis to the engineer and physicist. 
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PROBLEM SET 143-6 


1-4| FORMULAS FOR HYPERBOLIC FUNCTIONS 11. sin wi, cos (g7 — Tri) 


Show that 12. cos dar i, COS [aar(1 + i)] 


1. cosh z = cosh x cos y + i sinh x sin y 13-15 | Equations and Inequalities. Using the defini- 
tions, prove: 


sinh z = sinh x cos y + icoshx sin y. 
13. cosz is even, cos(—z) = cosz, and sinz is odd, 
2. cosh (z, + zg) = cosh z, cosh zy + sinh z, sinh zo sin (—z) = —sin z. 
14. |sinh y| S|cosz| S cosh y,|sinh y| S |sin z| S cosh y. 
Conclude that the complex cosine and sine are not 
bounded in the whole complex plane. 


sinh (zy + Za) = sinh z, cosh zz + cosh Zz sinh Zo. 


3. cosh” z — sinh” z = 1, cosh” z + sinh? z = cosh 2 

oe ° ii ‘ rade set ie acs 15. sin z1. cos zo = 4fsin (zy + Za) + sin(z1 — Z2)] 
4. Entire Functions. Prove that cos z, sin z, cosh z, and 

sinh z are entire. 16-19 | Equations. Find all solutions. 
5. Harmonic Functions. Verify by differentiation that 16. sinz = 100 17. cosh z = 0 

Im cos z and Re sin z are harmonic. 18. odes =i 19: sishe = 0 


20. Ret I . Show that 
6-12 Function Values. Find, in the form u + iv, Etae Sane SRE Bee 


6. sin 277i 7. cosi, sini Re tan z = eee 
8. cos wi, cosh 7i cos” x + sinh® y 
9. cosh (—1 + 21), cos (—2 — i) iia Ra ae 
10. sinh (3 + 41), cosh (3 + 4/7) cos? x + sinh? y 


13.7 Logarithm. General Power. Principal Value 


We finally introduce the complex logarithm, which is more complicated than the real 
logarithm (which it includes as a special case) and historically puzzled mathematicians 
for some time (so if you first get puzzled—which need not happen!—be patient and work 
through this section with extra care). 

The natural logarithm of z = x + iy is denoted by In z (sometimes also by log z) and 
is defined as the inverse of the exponential function; that is, w = In z is defined for z # 0 
by the relation 


e” = z, 


(Note that z= 0 is impossible, since e’” # 0 for all w; see Sec. 13.5.) If we set w = u + iv 
and z = re’’, this becomes 


Now, from Sec. 13.5, we know that e“*” has the absolute value e“ and the argument v. 
These must be equal to the absolute value and argument on the right: 
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EXAMPLE=-1 


e“ =r gives u = Inr, where Inr is the familiar real natural logarithm of the positive 


number r = |z|. Hence w = u + iv = Inz is given by 

(1) Inz=Inr+ i0 (r = |z| > 0, 6 = argz). 
Now comes an important point (without analog in real calculus). Since the argument of 
z is determined only up to integer multiples of 277, the complex natural logarithm 
In z (z # 0) is infinitely many-valued. 


The value of In z corresponding to the principal value Arg z (see Sec. 13.2) is denoted 
by Ln z (Ln with capital L) and is called the principal value of In z. Thus 


(2) Lnz = In [z| + iArgz (z # 0). 


The uniqueness of Arg z for given z (# 0) implies that Ln z is single-valued, that is, a 
function in the usual sense. Since the other values of arg z differ by integer multiples of 277, 
the other values of In z are given by 


(3) Inz=Lnz + 2n77i (n = 1,2,-°-). 
They all have the same real part, and their imaginary parts differ by integer multiples 
of 277. 

If z is positive real, then Arg z = 0, and Ln z becomes identical with the real natural 
logarithm known from calculus. If z is negative real (so that the natural logarithm of 
calculus is not defined!), then Arg z = 7 and 

Lnz=In|lz| + wi z negative real). 

From (1) and e!™” = + for positive real r we obtain 

(4a) ot =s 


as expected, but since arg (e*) = y + 2n7z is multivalued, so is 


(4b) In (e*) = z+ 2n7i, n=0,1,-:- 


Natural Logarithm. Principal Value 


In 1 = 0, +277, 477i, --- Lnl=0 
In 4 = 1.386294 + 2n7i Ln 4 = 1.386294 
In(—1) = £77, £3771, £571, --- Ln (—1) = wi 
In (—4) = 1.386294 + (2n + 1)ari Ln (—4) = 1.386294 + ai 
Ini = wi/2, —327/2, 577i/2,--- Lni = 7i/2 
In 47 = 1.386294 + qi/2 + 2n7i Ln 4i = 1.386294 + ri/2 
In (—47) = 1.386294 — qi/2 + 2n7i Ln (—4i) = 1.386294 — qi/2 
In (3 — 41) = In5 + arg (3 — 4i) Ln (3 — 41) = 1.609438 — 0.927295i 


= 1.609438 — 0.927295i + 2n7ri (Fig. 337) @ 
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VU 
-0.9 + 62 - e 
-0.9+42- ° 
-0.9 + 2n- cy 
oS 1 $2 # 
-0.9-2% - r) 


Fig. 337. Some values of In (3 — 4/) in Example 1 
The familiar relations for the natural logarithm continue to hold for complex values, that is, 
(5) (a) In(z3z2) =Inzy + Inzg, (6) In (@3/z2) = Inz — Inze 


but these relations are to be understood in the sense that each value of one side is also 
contained among the values of the other side; see the next example. 


Illustration of the Functional Relation (5) in Complex 


Let 


If we take the principal values 
Ln z, = Lnzg = Zi, 


then (5a) holds provided we write In (zyz2) = In 1 = 277i; however, it is not true for the principal value, 
Ln (z1z2) = Ln 1 = 0. al 


Analyticity of the Logarithm 


For every n = 0, +1, 2,--- formula (3) defines a function, which is analytic, 
except at 0 and on the negative real axis, and has the derivative 


(6) (In z)’ = (z not 0 or negative real). 


We show that the Cauchy—Riemann equations are satisfied. From (1)—(3) we have 
1 
Inzg=Inr+i(0+c)= 3 In? + y?) + i( arctan? + c) 
x 


where the constant c is a multiple of 277. By differentiation, 


x eh 1 i 
x2 + y? Y 14+ (y/x? x 


Uy = 


bance eta) 
Uy = ; 
- x2 + y 7 1+ (y/x)? x? 
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EXAMPLE 3 


Hence the Cauchy—Riemann equations hold. [Confirm this by using these equations in polar 
form, which we did not use since we proved them only in the problems (to Sec. 13.4).] 
Formula (4) in Sec. 13.4 now gives (6), 


. x ; 1 yy. x > iy 1 
Inz)’ = uy + ivy = + ( ) =-, a 
(ng) vee x7 + y? : 1+ (y/x)? x? x7 + y? Zz 


Each of the infinitely many functions in (3) is called a branch of the logarithm. The 
negative real axis is known as a branch cut and is usually graphed as shown in Fig. 338. 
The branch for n = 0 is called the principal branch of In z. 


y 


Fig. 338. Branch cut for Inz 


General Powers 


General powers of a complex number z = x + iy are defined by the formula 
elnz 


(7) Zz =e (c complex, z # 0). 


Since In z is infinitely many-valued, z° will, in general, be multivalued. The particular value 


ze = °° Lnz 
is called the principal value of z°. 
Ifc =n = 1,2,---, then z” is single-valued and identical with the usual nth power of z. 
If c = —1, —2,---, the situation is similar. 


If c = 1/n, where n = 2, 3,---, then 
= Wz = ea) Inz (z # 0), 


the exponent is determined up to multiples of 27ri/n and we obtain the n distinct values 
of the nth root, in agreement with the result in Sec. 13.2. If c = p/q, the quotient of two 
positive integers, the situation is similar, and z° has only finitely many distinct values. 
However, if c is real irrational or genuinely complex, then z° is infinitely many-valued. 


General Power 


: Ss 7 = 
ié = e' M* = exp (ilnd’) exo i(2 aa 2nm)| =e een, 


All these values are real, and the principal value (n = 0) is el, 


Similarly, by direct calculation and multiplying out in the exponent, 


(1 + 27? = exp[(2 - ) In (1 + |] = exp[@2 — A {In V2 + gai + 2nTi}] 


= 2¢7/4*27" sin (4 In 2) + icos ( In 2)]. @ 
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It is a convention that for real positive z = x the expression z° means e° In Where In x 


is the elementary real natural logarithm (that is, the principal value Ln z (z = x > O) in 
the sense of our definition). Also, if z = e, the base of the natural logarithm, z° = e° is 
conventionally regarded as the unique value obtained from (1) in Sec. 13.5. 

From (7) we see that for any complex number a, 


(8) a” = e In Gh 


We have now introduced the complex functions needed in practical work, some of them 
(e*, cos z, sin z, cosh z, sinh z) entire (Sec. 13.5), some of them (tan z, cot z, tanh z, coth z) 
analytic except at certain points, and one of them (In z) splitting up into infinitely many 
functions, each analytic except at 0 and on the negative real axis. 

For the inverse trigonometric and hyperbolic functions see the problem set. 


PROBLEEM—SET 13-7 


1-4| VERIFICATIONS IN THE TEXT 26. (i)? a7. (=17** 
1. Verify the computations in Example 1. 28. (3 + 4iy¥3 
ae Mestty (0) fat ox = —vand ag = lh 29. How can you find the answer to Prob. 24 from the 
3. Prove analyticity of Ln z by means of the Cauchy— answer to Prob. 23? 
Riemann equations in polar form (Sec. 13.4). ; : 
30. TEAM PROJECT. Inverse Trigonometric and 


4. Prove (4a) and (4b). 


COMPLEX NATURAL LOGARITHM In z 
5-11| Principal Value Ln z. Find Ln z when z equals 


Hyperbolic Functions. By definition, the inverse sine 
w = arcsin z is the relation such that sin w = z. The 
inverse cosine w = arccos z is the relation such that 
cos w = z. The inverse tangent, inverse cotangent, 


ae eae inverse hyperbolic sine, etc., are defined and denoted 
7.4 — 4i 8 1ti in a similar fashion. (Note that all these relations are 
9. 0.6 + 0.8: 10. —15 + O.1i multivalued.) Using sin w = (e’” — e~*”)/(2i) and 
11. ei similar representations of cos w, etc., show that 
12-16} All Values of In z. Find all values and graph (a) arecos ¢ = —iIn(g + Vz" — 1) 
some of them in the complex plane. (b) arcsin z = —iln (iz + V1 — 2?) 


12. Ine 13. Inde (c) arccosh z = In(z + Ale a 1) 
14. In (—7) 15. In (e’) 5 
16. In (4 + 31) (d) arcsinh z = In(z + Vz" + 1) 
17. Show that the set of values of In (i?) differs from the ee ae 

set of values of 2 In i. (e) arctan z = 2 In i-z 
18-21 | Equations. Solve for z. lt+z 


18. Inz = —7i/2 19. Inz =4— 33 

20. Inz=e- Ti 21. Inz = 0.6 + 0.47 
General Powers. Find the principal value. 
Show details. 
22. (2i)** 

24. (1 - a'* 


23. (1 + jh? 
25. (-3)°? 


1 

(f) arctanh z = 5 In ie 
(g) Show that w = arcsin z is infinitely many-valued, 
and if w, is one of these values, the others are of the 
form wy+2n7 and 7 — wy + 2n77,n = 0,1,---. 
(The principal value of w = u + iv = arcsin zis defined 
to be the value for which —7/2 Su S 7/2 ifu 20 
and —77/2 <u < 7/2 ifv <0.) 
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CHAP TER—-13-REVIEW-QUESTIONS AND PROBLEMS 


1. Divide 15 + 23i by —3 + 7i. Check the result by 45, a+i)/d-a 16. en? eT m/2 
multiplication. 


17-20 Polar Form. Represent in polar form, with the 


2. What happens to a quotient if you take the complex alii 
principal argument. 


conjugates of the two numbers? If you take the absolute 


values of the numbers? 17. —4 - 4i 18.12+ 7, 12-i 
3. Write the two numbers in Prob. | in polar form. Find 19. —15i 20. 0.6 + 0.87 
the principal values of their arguments. 21-24) Roots. Find and graph all values of: 
4. State the definition of the derivative from memory. 2], \/81 22. V —-32i 
Explain the big difference from that in calculus. 23, W—1 24, W1 
5. What is an analytic function of a complex variable? 
25-30 | Analytic Functions. Find f(z) = u(x, y) + iv(x, y) 


6. Can a function be differentiable at a point without being 
analytic there? If yes, give an example. 


with u or v as given. Check by the Cauchy—Riemann equations 


: . for analyticity. 
7. State the Cauchy—Riemann equations. Why are they of 3 6 
basic importance? 25 u = xy 26. v = y/(x" + y") 
= 2x .: = 
8. Discuss how e*, cos z, sin z, cosh z, sinh z are related. 27.u=—-e ™ mn 2y ; 28. u = cos 3x cosh 3y 
9. In z is more complicated than In x. Explain. Give 29: 4 = exp(—(x" — y")/2) cos xy 
examples. 30. v = cos 2x sinh 2y 
10. How are general powers defined? Give an example. | 31-35] Special Function Values. Find the value of: 
Convert it to the form x + iy. . . 
31. cos (3 — i) 32. Ln (0.6 + 0.87) 
11-16} Complex Numbers. Find, in the form x + iy, 33. tani 
Showing — ' 34. sinh (1 + ari), sin(1 + wri) 
11. (2 + 3i) 12. (1 — i) 35. cosh (7 + Ti) 


13. 1/(4 + 3i) 14. Vi 


SUMMARY-OF-CHAPTER-13 


Complex Numbers and Functions. Complex Differentiation 


For arithmetic operations with complex numbers 
(1) z=x+ iy = re” =r(cos@ + isin6), 


r= [z| = Vx? + y?,6 = arctan (y/x), and for their representation in the complex 
plane, see Secs. 13.1 and 13.2. 

A complex function f(z) = u(x, y) + iv(x, y) is analytic in a domain D if it has 
a derivative (Sec. 13.3) 


f(z + Az) — f() 
Az 


(2) f@ = jim, 


everywhere in D. Also, f(z) is analytic at a point z = Zo if it has a derivative in a 
neighborhood of zo (not merely at Zo itself). 
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If f(z) is analytic in D, then u(x, y) and u(, y) satisfy the (very important!) 
Cauchy—Riemann equations (Sec. 13.4) 


(3) = a= 


everywhere in D. Then u and v also satisfy Laplace’s equation 
(4) Ux + Uyy = 0, Uxx + Vyy = 0 


everywhere in D. If u(x, y) and u(x, y) are continuous and have continuous partial 
derivatives in D that satisfy (3) in D, then f(z) = u(x, y) + iv(x, y) is analytic in 
D. See Sec. 13.4. (More on Laplace’s equation and complex analysis follows in 
Chap. 18.) 

The complex exponential function (Sec. 13.5) 


(5) e* = expz = e” (cosy + isin y) 


reduces to e” if z = x (y = 0). It is periodic with 277i and has the derivative e*. 
The trigonometric functions are (Sec. 13.6) 

cos z = 4(e* + e~) = cosx cosh y — isin x sinh y 

(6) 1 

sinz = 5 


-(e* — e~”) = sin x cosh y + icos x sinh y 
i 


and, furthermore, 
tan z = (sin z)/cos z, cot z = I/tanz, ete. 


The hyperbolic functions are (Sec. 13.6) 


(7)  coshz = 3(e* +e *) = cos iz, sinh z 3(e* e °) = —isiniz 


etc. The functions (5)-(7) are entire, that is, analytic everywhere in the complex 
plane. 
The natural logarithm is (Sec. 13.7) 


(8) In z = In|z| + iargz = In|z| + iArgz + 2nTi 

where z # 0 and n = 0,1,---. Arg z is the principal value of arg z, that is, 
—7 < Arg z S77. We see that In z is infinitely many-valued. Taking n = 0 gives 
the principal value Ln z of In z; thus Ln z = In|z| + i Arg z. 


General powers are defined by (Sec. 13.7) 


3 a gene (c complex, z # 0). 


CHAPTER | 4 


Complex Integration 


Chapter 13 laid the groundwork for the study of complex analysis, covered complex num- 
bers in the complex plane, limits, and differentiation, and introduced the most important 
concept of analyticity. A complex function is analytic in some domain if it is differentiable 
in that domain. Complex analysis deals with such functions and their applications. The 
Cauchy—Riemann equations, in Sec. 13.4, were the heart of Chapter 13 and allowed a means 
of checking whether a function is indeed analytic. In that section, we also saw that analytic 
functions satisfy Laplace’s equation, the most important PDE in physics. 

We now consider the next part of complex calculus, that is, we shall discuss the first 
approach to complex integration. It centers around the very important Cauchy integral 
theorem (also called the Cauchy—Goursat theorem) in Sec. 14.2. This theorem is important 
because it allows, through its implied Cauchy integral formula of Sec. 14.3, the evaluation 
of integrals having an analytic integrand. Furthermore, the Cauchy integral formula shows 
the surprising result that analytic functions have derivatives of all orders. Hence, in this 
respect, complex analytic functions behave much more simply than real-valued functions 
of real variables, which may have derivatives only up to a certain order. 

Complex integration is attractive for several reasons. Some basic properties of analytic 
functions are difficult to prove by other methods. This includes the existence of derivatives 
of all orders just discussed. A main practical reason for the importance of integration in 
the complex plane is that such integration can evaluate certain real integrals that appear 
in applications and that are not accessible by real integral calculus. 

Finally, complex integration is used in connection with special functions, such as 
gamma functions (consult [GenRef1]), the error function, and various polynomials (see 
[GenRef10]). These functions are applied to problems in physics. 

The second approach to complex integration is integration by residues, which we shall 
cover in Chapter 16. 


Prerequisite: Chap. 13. 
Section that may be omitted in a shorter course: 14.1, 14.5. 
References and Answers to Problems: App. | Part D, App. 2. 


14.1 Line Integral in the Complex Plane 


As in calculus, in complex analysis we distinguish between definite integrals and indefinite 
integrals or antiderivatives. Here an indefinite integral is a function whose derivative 
equals a given analytic function in a region. By inverting known differentiation formulas 
we may find many types of indefinite integrals. 

Complex definite integrals are called (complex) line integrals. They are written 


I F(2 dz. 


c 
643 
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Here the integrand f(z) is integrated over a given curve C or a portion of it (an arc, but 
we shall say “curve” in either case, for simplicity). This curve C in the complex plane is 
called the path of integration. We may represent C by a parametric representation 


(1) z(t) = x(t) = iy(2) (aStb). 


The sense of increasing f is called the positive sense on C, and we say that C is oriented 
by (1). 

For instance, z(t) = t + 3it (0 S t S 2) gives a portion (a segment) of the line y = 3x. 
The function z(t) = 4.cost + 4i sin t(—7 3S t S 77) represents the circle lz| = 4, and so 
on. More examples follow below. 

We assume C to be a smooth curve, that is, C has a continuous and nonzero derivative 


; d_, Fe 
at) = = = x) + iy) 


ke 
dt 
at each point. Geometrically this means that C has everywhere a continuously turning 
tangent, as follows directly from the definition 


. . z(t + At) — z(t) fes286 
z(t) = dim, Al (Fig. ). 


Here we use a dot since a prime ’ denotes the derivative with respect to z. 


Definition of the Complex Line Integral 


This is similar to the method in calculus. Let C be a smooth curve in the complex plane 
given by (1), and let f(z) be a continuous function given (at least) at each point of C. We 
now subdivide (we “partition”) the interval a S t S b in (1) by points 


to (= a), th, my tn-1> th (= b) 
where fg < fy < +: < ty. To this subdivision there corresponds a subdivision of C by 
points 
ZO, 21s “*'s Zn—ts Zn (= Z) (Fig. 340), 


Fig. 339. Tangent vector z(t) of a curve C in the Fig. 340. Complex line integral 
complex plane given by z(t). The arrowhead on the 
curve indicates the positive sense (sense of increasing t) 
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where z; = z(t;). On each portion of subdivision of C we choose an arbitrary point, say, 
a point 2; between Zo and z, (that is, 2; = z(t) where f satisfies fg S ¢ S 11), a point fo 
between z, and zg, etc. Then we form the sum 


n 
(2) Sn = = fm) Azm where AZm = 2m — 2m-1- 
m=1 
We do this for each n = 2,3,--- in a completely independent manner, but so that the 
greatest |At,,| = |tm, — tm—1| approaches zero as n> ©. This implies that the greatest 


|Az»| also approaches zero. Indeed, it cannot exceed the length of the arc of C from 
Zm-—1 tO Zm and the latter goes to zero since the arc length of the smooth curve C is a 
continuous function of ¢. The limit of the sequence of complex numbers So, S3,--- thus 
obtained is called the line integral (or simply the integral) of f(z) over the path of 
integration C with the orientation given by (1). This line integral is denoted by 


(3) | F@) dz, or by ; ii@idz 


Cc Cc 


if C is a closed path (one whose terminal point Z coincides with its initial point zo, as 
for a circle or for a curve shaped like an 8). 


General Assumption. All paths of integration for complex line integrals are assumed to 
be piecewise smooth, that is, they consist of finitely many smooth curves joined end to end. 


Basic Properties Directly Implied by the Definition 


1. Linearity. Integration is a linear operation, that is, we can integrate sums term by 
term and can take out constant factors from under the integral sign. This means that 
if the integrals of f; and fg over a path C exist, so does the integral of ky fy + kofo 
over the same path and 


(4) | tA + Ko folz)] dz = k| Si(@) dz + ka| So(z) dz. 


Cc C C 


2. Sense reversal in integrating over the same path, from Zg to Z (left) and from Z to 
Zo (right), introduces a minus sign as shown, 


Z Zo 
(5) | f(2 dz = -| f (2) dz. 


Zo Z 


3. Partitioning of path (see Fig. 341) 


(6) | f@ dz = | faz + | 
Cc 


Cy Cc 


S(@ dz. 


Fig. 341. Partitioning of path [formula (6)] 
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Existence of the Complex Line Integral 


Our assumptions that f(z) is continuous and C is piecewise smooth imply the existence 
of the line integral (3). This can be seen as follows. 
As in the preceding chapter let us write f(z) = u(x, y) + iv(x, y). We also set 


lm = &ém + inm and Azm = Axm + iAym. 
Then (2) may be written 
(7) Sn = Diu + iv)\(Axm + iAym) 


where u = u(lm, Nm), V = UV(Sm; Nm) and we sum over m from | to n. Performing the 
multiplication, we may now split up S,, into four sums: 


Sy = SNueAcny= Sv Aym +i| Sudym + SY vAxnl. 


These sums are real. Since f is continuous, uw and v are continuous. Hence, if we let n 
approach infinity in the aforementioned way, then the greatest Ax,,, and Ay,, will approach 
zero and each sum on the right becomes a real line integral: 


(8) jim Sp = | faa 
Cc 
= |uar— [vay +i[[uay+ [oa 
Cc Cc Cc Cc 


This shows that under our assumptions on f and C the line integral (3) exists and its value 
is independent of the choice of subdivisions and intermediate points ¢,,. |_| 


First Evaluation Method: 
Indefinite Integration and Substitution of Limits 


This method is the analog of the evaluation of definite integrals in calculus by the well- 
known formula 


b 
| f0 dx = F(b) — F(a) 


where [F'(x) = f(x]. 

It is simpler than the next method, but it is suitable for analytic functions only. To 
formulate it, we need the following concept of general interest. 

A domain D is called simply connected if every simple closed curve (closed curve 
without self-intersections) encloses only points of D. 

For instance, a circular disk is simply connected, whereas an annulus (Sec. 13.3) is not 
simply connected. (Explain!) 
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THEOREM 1 


EXAMPLE 1 


EXAMPLE 2 


EXAMPLE-3 


EXAMPLE 4 


THEOREM 2 


Indefinite Integration of Analytic Functions 


Let f(z) be analytic in a simply connected domain D. Then there exists an indefinite 
integral of f(z) in the domain D, that is, an analytic function F(z) such that 
F(z) = f(z) in D, and for all paths in D joining two points zo and z, in D we have 


(9) | Tada Zi) = Fa) [F'@ = f@). 


Zo 


(Note that we can write Zg and z instead of C, since we get the same value for all 
those C from Zg to Z1.) 


This theorem will be proved in the next section. 
Simple connectedness is quite essential in Theorem 1, as we shall see in Example 5. 
Since analytic functions are our main concern, and since differentiation formulas will often 
help in finding F(z) for a given f(z) = F'(z), the present method is of great practical interest. 
If f(z) is entire (Sec. 13.5), we can take for D the complex plane (which is certainly 
simply connected). 


1+i 1t+i 
i 1 2, 2 
2 dz 2 a+ taj | 
Jo alle 3 3 3 
Ti Ti 
cos zdz = sinz = 2 sin Ti = 2i sinh 7 = 23.097i ia 
Nay —Ti 
8-371 8-377 - - 
| el? dz= Ie2/2 = et 87/2 = ett 7/2) =0 
847i 8+ mt 
since e* is periodic with period 277i. B 


i , : 
dz i iv 
| — = Lni — Ln(—i) 5 ( 5 ) ir. Here D is the complex plane without 0 and the negative real 


-t 


axis (where Ln z is not analytic). Obviously, D is a simply connected domain. a 


Second Evaluation Method: 
Use of a Representation of a Path 


This method is not restricted to analytic functions but applies to any continuous complex 
function. 


Integration by the Use of the Path 


Let C be a piecewise smooth path, represented by z = z(t), where a St S b. Let 
(2 be a continuous function on C. Then 


b 
d. 
(10) | F(@) dz = | flz<@\z@ at (2 - “). 


Cc a 
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EXAMPLE 5 


CHAP. 14 Complex Integration 


The left side of (10) is given by (8) in terms of real line integrals, and we show that 
the right side of (10) also equals (8). We have z = x + iy, hence z = x + iy. We simply 
write u for u[x(f), y(‘)] and v for v[x(4), y()]. We also have dx = xdt and dy = y dt. 
Consequently, in (10) 


b 
| flz(f)|z(@) dt (u + iv)(x + iy) dt 


a 


ll 
eo 
o 


| war ody + iudy + van) 
Cc 


| wae vay +i] way + va, |_| 
Cc Cc 


COMMENT. In (7) and (8) of the existence proof of the complex line integral we referred 
to real line integrals. If one wants to avoid this, one can take (10) as a definition of the 
complex line integral. 


Steps in Applying Theorem 2 


(A) Represent the path C in the form z(t) (a St S b). 

(B) Calculate the derivative z(t) = dz/dt. 

(C) Substitute z(t) for every z in f(z) (hence x(f) for x and y(t) for y). 
(D) Integrate f[z(f)]z(4) over t from a to b. 


A Basic Result: Integral of 1/z Around the Unit Circle 


We show that by integrating 1/z counterclockwise around the unit circle (the circle of radius 1 and center 0; 
see Sec. 13.3) we obtain 


(650) ; a = 277i (C the unit circle, counterclockwise). 
@ 


This is a very important result that we shall need quite often. 


Solution. (A) We may represent the unit circle C in Fig. 330 of Sec. 13.3 by 
(ft) = cost + isint = e* (0StS2n), 


so that counterclockwise integration corresponds to an increase of f from 0 to 277. 
(B) Differentiation gives z(t) = ie” (chain rule!). 
(C) By substitution, f(z()) = 1/z() = a" 
(D) From (10) we thus obtain the result 


dz 27 ; , 2r 
pS = | e tie dt = i| dt = 277i. 


z 


Cc 0 0 


Check this result by using z(t) = cost + isint. 

Simple connectedness is essential in Theorem 1. Equation (9) in Theorem | gives 0 for any closed path 
because then z1 = Zo, so that F(z) — F(zo) = 0. Now 1/z is not analytic at z = 0. But any simply connected 
domain containing the unit circle must contain z = 0, so that Theorem 1 does not apply—it is not enough that 
1/z is analytic in an annulus, say, 3 < |z| < 3, because an annulus is not simply connected! | 
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EXAMPLE 6 


EXAMPLE 7 


Integral of 1/z™ with Integer Power m 


Let f(z) = (z — zo)” where m is the integer and zg a constant. Integrate counterclockwise around the circle C 
of radius p with center at zo (Fig. 342). 


x 


Fig. 342. Path in Example 6 


Solution. We may represent C in the form 


z(t) = zo + p(cost + isint) = zo + pe Off 27). 


Then we have 


m jimt 


(z — 20)" = p™e"™, dz = ipe” dt 
and obtain 


27 27 
; (z - Zo)” dz = | prem ipet d= om elhm+ Dt dt. 
Cc 0 0 


By the Euler formula (5) in Sec. 13.6 the right side equals 


20 


2a 
wom | cos (m + 1)tdt+ i| 


sin (m + 1)t a , 
0 0 


m+1 


If m = —1, we have p 1, cos 0 = 1, sin0 = 0. We thus obtain 277i. For integer m # —1 each of the two 
integrals is zero because we integrate over an interval of length 277, equal to a period of sine and cosine. Hence 
the result is 


Ini = (m= —1), 
(12) ; Z— Z0)™ dz = a 
ic 0 (m # —1and integer). 


Dependence on path. Now comes a very important fact. If we integrate a given function 
F(z) from a point zo to a point z1 along different paths, the integrals will in general have 
different values. In other words, a complex line integral depends not only on the endpoints 
of the path but in general also on the path itself. The next example gives a first impression 
of this, and a systematic discussion follows in the next section. 


Integral of a Nonanalytic Function. Dependence on Path 


Integrate f(z) = Re z = x from 0 to | + 2i (a) along C* in Fig. 343, (b) along C consisting of Cy and C2. 


Solution. (a) C* can be represented by z(t) = t + 2it(0 StS 1). Hence 2(t) = 1 + 2i and f[z(t)] = 
x(t) = t on C*. We now calculate 


. 1 1 
Rezdz 1(1 + 2i) dt 1+2i bi. 
[3 [a ar =a + 29-1 
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Fig. 343. Paths in Example 7 


(b) We now have 


Cy: z(t) = t, 2(t) = 1, S(e)) = x) = t OsStsSl) 
Co: z(t) = 1 + it, 2(t) = i, SQ) = x) = 1 Ost 2). 


Using (6) we calculate 
1 2 1 
[Rezae= | Rezde+ | Reede= [rat | ria =442% 
Cc c C. 0 0 2 
Note that this result differs from the result in (a). B 


Bounds for Integrals. ML-Inequality 


There will be a frequent need for estimating the absolute value of complex line integrals. 
The basic formula is 


(13) = ML (ML-inequality); 


| Ff (2) dz 
c 


L is the length of C and M a constant such that | f(z)| S M everywhere on C. 


Taking the absolute value in (2) and applying the generalized inequality (6*) in Sec. 13.2, 
we obtain 


m=1 m1 


> £Gm) Azm 


m1 


Sp] = 


Now |Az,| is the length of the chord whose endpoints are z,,—1 and z,, (see Fig. 340). 
Hence the sum on the right represents the length L* of the broken line of chords whose 
endpoints are Z9, Z1,°°*, Zn (= Z). Ifn approaches infinity in such a way that the greatest 
| At;,| and thus |Az,,| approach zero, then L* approaches the length L of the curve C, by 
the definition of the length of a curve. From this the inequality (13) follows. (| 


We cannot see from (13) how close to the bound ML the actual absolute value of the 
integral is, but this will be no handicap in applying (13). For the time being we explain 
the practical use of (13) by a simple example. 
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EXAMPLE 8 


1 


Fig. 344. Path in 
Example 8 


Estimation of an Integral 


Find an upper bound for the absolute value of the integral 


| 2 dz, C the straight-line segment from 0 to | + 7, Fig. 344. 
c 


Solution. L= V2 and If(@| = |z2| S=2onC gives by (13) 


[2 dz 


Cc 


S2V2 = 2.8284. 


The absolute value of the integral is |-2 a 23 = 3 V2 = 0.9428 (see Example 1). 3] 


Summary on Integration. Line integrals of f(z) can always be evaluated by (10), using 
a representation (1) of the path of integration. If f(z) is analytic, indefinite integration by 


(9) as in calculus will be simpler (proof in the next section). 


PROBLEEM—SET 14-1 


FIND THE PATH and sketch it. 
.2n=A1+sidt (2515) 
.2)=3+i+0-ad)t OStS3) 
aa) =t+ 2 ( StS2) 
aq)=tt+(-A)7i (-1St2=1) 
of) =3—-i+ Vl0e"* (0St S27) 
ay)=1l+ite™ O12) 
zt) = 2+ 4e™? (S752) 

. t)=5e* OStS 7/2) 

. 2) =t+it® (2512) 

. 2t) = 2cost+isnt (OStS 27) 


im 
—_ 
i) 


ee IAN awry 


— 
—] 


FIND A PARAMETRIC REPRESENTATION 
and sketch the path. 

11. Segment from (—1, 1) to (1, 3) 

12. From (0, 0) to (2, 1) along the axes 

13. Upper half of |z — 2 + i] = 2 from (4, —1) to (0, —1) 
14, Unit circle, clockwise 

15. x? — 4y” = 4, the branch through (2, 0) 

16. Ellipse 4x2 + 9y? = 36, counterclockwise 

17. |z +. a + ib| = r, clockwise 

18. y = 1/x from (1, 1) to (5,3) 

19. Parabola y = 1 — 4x? (-23%52) 

20. 4(x — 2)? + 5(y + 1)? = 20 


21-30| INTEGRATION 


Integrate by the first method or state why it does not apply 
and use the second method. Show the details. 


21. | Re z dz, C the shortest path from 1 + i to 3 + 3i 
Cc 


22. 


23. 


24. 


25. 


27. 


28. 


29. 


30. 


31. 


Re zdz, C the parabola y = 1 + d(x — 1)? from 


Q 


1+ ito3 + 3i 


| e* dz, C the shortest path from 7ri to 277i 
Cc 


cos 2z dz, C the semicircle |z| = 7,x 20 from 


Q 


—Ti to Ti 


| zexp (2?) dz, C from 1 along the axes to i 
Cc 


. | (z + z_}) dz, C the unit circle, counterclockwise 
c 


| sec” z dz, any path from 77/4 to 7i/4 
c 


\f 5 5) de, Cthecincte |e — 2il = 4 
’ z—2i (= 2) 


clockwise 


| Im z? dz counterclockwise around the triangle with 
Cc 
vertices 0, 1, i 


| Re 2 dz clockwise around the boundary of the square 
Cc 

with vertices 0,7, 1 + i, 1 

CAS PROJECT. Integration. Write programs for the 
two integration methods. Apply them to problems of 
your choice. Could you make them into a joint program 
that also decides which of the two methods to use in a 
given case? 
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32. 


33. 


34. 
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Sense reversal. Verify (5) for f(z) = z7, where C is 
the segment from —1 — ito 1 +7. 


Path partitioning. Verify (6) for f(z) = 1/z and Cy, 
and C2 the upper and lower halves of the unit circle. 


TEAM EXPERIMENT. Integration. (a) Comparison. 
First write a short report comparing the essential points 
of the two integration methods. 


(b) Comparison. Evaluate | f(@ dz by Theorem 1 
c 
and check the result by Theorem 2, where: 


(i) f(2) = z* and C is the semicircle |z| = 2 from 
—2i to 27 in the right half-plane, 


(ii) f(z) = e* and C is the shortest path from 0 to 
1 + 2i. 

(c) Continuous deformation of path. Experiment 
with a family of paths with common endpoints, say, 
at) = t+ iasint,0 St=7, with real parameter a. 
Integrate nonanalytic functions (Re z, Re (2), etc.) and 
explore how the result depends on a. Then take analytic 
functions of your choice. (Show the details of your 
work.) Compare and comment. 
(d) Continuous deformation of path. Choose another 
family, for example, semi-ellipses z(t) = acost + 
isin t, —7/2 St S 7/2, and experiment as in (c). 


35. ML-inequality. Find an upper bound of the absolute 


value of the integral in Prob. 21. 


14.2 Cauchy’s Integral Theorem 


This section is the focal point of the chapter. We have just seen in Sec. 14.1 that a line 
integral of a function f(z) generally depends not merely on the endpoints of the path, but 
also on the choice of the path itself. This dependence often complicates situations. Hence 
conditions under which this does not occur are of considerable importance. Namely, if 
f(z) is analytic in a domain D and D is simply connected (see Sec. 14.1 and also below), 
then the integral will not depend on the choice of a path between given points. This result 
(Theorem 2) follows from Cauchy’s integral theorem, along with other basic consequences 
that make Cauchy’s integral theorem the most important theorem in this chapter and 
fundamental throughout complex analysis. 

Let us continue our discussion of simple connectedness which we started in Sec. 14.1. 


1. A simple closed path is a closed path (defined in Sec. 14.1) that does not intersect 
or touch itself as shown in Fig. 345. For example, a circle is simple, but a curve 


shaped like an 8 is not simple. 


CAE EE 


Simple 


Fig. 345. 


Not simple Not simple 


Closed paths 


2. A simply connected domain D in the complex plane is a domain (Sec. 13.3) such 
that every simple closed path in D encloses only points of D. Examples: The interior 
of a circle (“open disk”), ellipse, or any simple closed curve. A domain that is not 
simply connected is called multiply connected. Examples: An annulus (Sec. 13.3), 
a disk without the center, for example, 0 < Iz] < 1. See also Fig. 346. 

More precisely, a bounded domain D (that is, a domain that lies entirely in some 
circle about the origin) is called p-fold connected if its boundary consists of p closed 


SEC. 14.2. Cauchy’s Integral Theorem 653 


THEOREM 1 


EXAMPLE 1 


EXAMPLE 2 


GCOS 


Simply Simply Doubly Triply 
connected connected connected connected 


Fig. 346. Simply and multiply connected domains 


connected sets without common points. These sets can be curves, segments, or single 
points (such as z = 0 for0 < lz] < 1, for which p = 2). Thus, D has p — | “holes,” 
where “hole” may also mean a segment or even a single point. Hence an annulus 
is doubly connected (p = 2). 


Cauchy’s Integral Theorem 


If f(z) is analytic in a simply connected domain D, then for every simple closed path 
C in D, 


(1) £0 da—0! See Fig. 347. 
Cc 
- Serie " 
- y 
oo » 
ait \ 
aia ] 
oc / 
&é a 
CG 7 
\ D ~ ” 
‘SS _ aa, 


Fig. 347. Cauchy’s integral theorem 


Before we prove the theorem, let us consider some examples in order to really understand 
what is going on. A simple closed path is sometimes called a contour and an integral over 
such a path a contour integral. Thus, (1) and our examples involve contour integrals. 


Entire Functions 
} ede =0, } cos zdz = 0, fet ae=0 (n = 0, 1,---) 
c c c 


for any closed path, since these functions are entire (analytic for all z). ia] 


Points Outside the Contour Where f(x) is Not Analytic 


dz 
sec zdz = 0, =0 
Cc czt+4 


where C is the unit circle, sec z = 1/cos z is not analytic at z = 47/2, +377/2,---, but all these points lie 
outside C; none lies on C or inside C. Similarly for the second integral, whose integrand is not analytic at 
Zz = £2: outside C. | 
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EXAMPLE 4 


EXAMPLE 5 


PROOF 
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Nonanalytic Function 


where C: z(t) = e” is the unit circle. This does not contradict Cauchy’s theorem because f(z) = Zz is not 
analytic. 


Analyticity Sufficient, Not Necessary 


dz 
—=0 
c 22 


where C is the unit circle. This result does not follow from Cauchy’s theorem, because f(z) = 1/ 2? is not analytic 
atz = 0. Hence the condition that f be analytic in D is sufficient rather than necessary for (1) to be true. & 


Simple Connectedness Essential 


dz 
— 277i 
Cc 


xz 


for counterclockwise integration around the unit circle (see Sec. 14.1). C lies in the annulus 3 <[z| < 3 where 
1/z is analytic, but this domain is not simply connected, so that Cauchy’s theorem cannot be applied. Hence the 
condition that the domain D be simply connected is essential. 

In other words, by Cauchy’s theorem, if f(z) is analytic on a simple closed path C and everywhere inside C, 
with no exception, not even a single point, then (1) holds. The point that causes trouble here is z = 0 where 1/z 
is not analytic. ia 


Cauchy proved his integral theorem under the additional assumption that the derivative 
f'(2 is continuous (which is true, but would need an extra proof). His proof proceeds as 
follows. From (8) in Sec. 14.1 we have 


$ fo de = ude vd) + i$ tudy + vedo) 
Cc Cc Cc 


Since f(z) is analytic in D, its derivative f ‘(z) exists in D. Since i ‘(z) is assumed to be 
continuous, (4) and (5) in Sec. 13.4 imply that u and v have continuous partial derivatives 
in D. Hence Green’s theorem (Sec. 10.4) (with uw and —v instead of F' and F9) is applicable 


and gives 
0 0 
pwd oay= |] ( ° “Vaca 
Cc ox oy 


R 


where R is the region bounded by C. The second Cauchy—Riemann equation (Sec. 13.4) 
shows that the integrand on the right is identically zero. Hence the integral on the left is 
zero. In the same fashion it follows by the use of the first Cauchy—Riemann equation that 
the last integral in the above formula is zero. This completes Cauchy’s proof. a 


Goursat’s proof without the condition that f(z) is continuous' is much more complicated. 
We leave it optional and include it in App. 4. 


'EDOUARD GOURSAT (1858-1936), French mathematician who made important contributions to complex 
analysis and PDEs. Cauchy published the theorem in 1825. The removal of that condition by Goursat (see Transactions 
Amer. Math Soc., vol. 1, 1900) is quite important because, for instance, derivatives of analytic functions are also 
analytic. Because of this, Cauchy’s integral theorem is also called Cauchy—Goursat theorem. 
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THEOREM 2 


PROOF 


Independence of Path 


We know from the preceding section that the value of a line integral of a given function 
f(z) from a point zz to a point zg will in general depend on the path C over which we 
integrate, not merely on z, and Zg. It is important to characterize situations in which this 
difficulty of path dependence does not occur. This task suggests the following concept. 
We call an integral of f(z) independent of path in a domain D if for every z1, z2 in D 
its value depends (besides on f(z), of course) only on the initial point z1 and the terminal 
point zg, but not on the choice of the path C in D [so that every path in D from z, to zg 
gives the same value of the integral of f(z)]. 


Independence of Path 


Tf f(z) is analytic in a simply connected domain D, then the integral of f(z) is 
independent of path in D. 


Let z, and Zz be any points in D. Consider two paths Cy and C2 in D from z, to zz without 
further common points, as in Fig. 348. Denote by Cz the path Cy. with the orientation 
reversed (Fig. 349). Integrate from z, over Cy to zz and over Cy back to z,. This is a 
simple closed path, and Cauchy’s theorem applies under our assumptions of the present 
theorem and gives zero: 


(2') | sac+ | fac=0, thus | sac=-| f dz. 


Cy Cz Cy C3 


But the minus sign on the right disappears if we integrate in the reverse direction, from 
Z 1 to zg, which shows that the integrals of f(z) over Cy and Co are equal, 


(2) | f(2) dz = | 


f(@ az (Fig. 348). 
Ci Co 


This proves the theorem for paths that have only the endpoints in common. For paths that 
have finitely many further common points, apply the present argument to each “loop” 
(portions of C, and Cz between consecutive common points; four loops in Fig. 350). For 
paths with infinitely many common points we would need additional argumentation not 
to be presented here. 


Fig. 348. Formula (2) Fig. 349. Formula (2') Fig. 350. Paths with more 
common points 
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Principle of Deformation of Path 


This idea is related to path independence. We may imagine that the path C2 in (2) was 
obtained from C, by continuously moving C (with ends fixed!) until it coincides with 
Co. Figure 351 shows two of the infinitely many intermediate paths for which the integral 
always retains its value (because of Theorem 2). Hence we may impose a continuous 
deformation of the path of an integral, keeping the ends fixed. As long as our deforming 
path always contains only points at which f(z) is analytic, the integral retains the same 
value. This is called the principle of deformation of path. 


Fig. 351. Continuous deformation of path 


A Basic Result: Integral of Integer Powers 
From Example 6 in Sec. 14.1 and the principle of deformation of path it follows that 
27i (m= —1) 
(3) f z— Zo)" dz = 
0 (m # —1 and integer) 


for counterclockwise integration around any simple closed path containing Zo in its interior. 
Indeed, the circle |z — zo] = p in Example 6 of Sec. 14.1 can be continuously deformed in two steps into a path 
as just indicated, namely, by first deforming, say, one semicircle and then the other one. (Make a sketch). =] 


Existence of Indefinite Integral 


We shall now justify our indefinite integration method in the preceding section [formula 
(9) in Sec. 14.1]. The proof will need Cauchy’s integral theorem. 


Existence of Indefinite Integral 

Tf f(z) is analytic in a simply connected domain D, then there exists an indefinite 
integral F(z) of f(z) in D—thus, F @= f(2—which is analytic in D, and for all 
paths in D joining any two points Z9 and z in D, the integral of f(z) from zg to 24 
can be evaluated by formula (9) in Sec. 14.1. 


The conditions of Cauchy’s integral theorem are satisfied. Hence the line integral of f(z) 
from any Zg in D to any z in D is independent of path in D. We keep zo fixed. Then this 
integral becomes a function of z, call if F(z), 


(4) F(z) = | f(z*) dz* 
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which is uniquely determined. We show that this F(z) is analytic in D and F (Z= Ff). 
The idea of doing this is as follows. Using (4) we form the difference quotient 


z+Az z z+Az 
Fg + Ad - Fe) _ 1 | : -— | r ‘|-2] “ae 
(5) he ec A F(Z*) dz F(Z*) dz Az | LR) dz*. 


20 


We now subtract f(z) from (5) and show that the resulting expression approaches zero as 
Az— 0. The details are as follows. 

We keep z fixed. Then we choose z + Azin D so that the whole segment with endpoints 
zand z + Azis in D (Fig. 352). This can be done because D is a domain, hence it contains 
a neighborhood of z. We use this segment as the path of integration in (5). Now we subtract 
f(z). This is a constant because z is kept fixed. Hence we can write 


z+Az z+Az 


z+Az 
| F (2) dz* =| dz* = f(z) Az. Thus f@ = ~ | F(z) dz*. 
Zz Zz z 


z 


By this trick and from (5) we get a single integral: 


F(z + Az) — F(z) 
Az 


z+hz 
f@ = | [ f(z*) — f(z] dz*. 
Az 


z 


Since f(z) is analytic, it is continuous (see Team Project (24d) in Sec. 13.3). An e > 0 
being given, we can thus find a 6 > 0 such that | f(z*) — f(z)| < € when |z* — z| < 6. 
Hence, letting |Az| < 5, we see that the ML-inequality (Sec. 14.1) yields 


F(z + Az) — F(z) 
Az 


1 
= e|Az| = e. 


| Az| 


z+Az 
1 
i) = | Lf(z*) — f(z)] de® 
[Az] IJ, 


By the definition of limit and derivative, this proves that 


F(z + Az) — F(z) 
Az 7 


f(2). 


! e 
oe 


Since z is any point in D, this implies that F(z) is analytic in D and is an indefinite integral 
or antiderivative of f(z) in D, written 


F(z) = | 10 dz. 


Fig. 352. Path of integration 
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Also, if G'(z) = f(z), then F(z) — Gz) = 0 in D; hence F(z) — G(z) is constant in D 
(see Team Project 30 in Problem Set 13.4). That is, two indefinite integrals of f(z) can 
differ only by a constant. The latter drops out in (9) of Sec. 14.1, so that we can use any 
indefinite integral of f(z). This proves Theorem 3. (| 


Cauchy’s Integral Theorem 
for Multiply Connected Domains 


Cauchy’s theorem applies to multiply connected domains. We first explain this for a 
doubly connected domain D with outer boundary curve Cy and inner C2 (Fig. 353). If 
a function f(z) is analytic in any domain D* that contains D and its boundary curves, we 
claim that 


(6) ; f(@dz= ; F(2@) dz (Fig. 353) 
Cc 


1 Cy 


both integrals being taken counterclockwise (or both clockwise, and regardless of whether 
or not the full interior of Cz belongs to D*). 


Cc, 


Fig. 353. Paths in (5) 


By two cuts Gi and ee (Fig. 354) we cut D into two simply connected domains D, and 
Dg in which and on whose boundaries f(z) is analytic. By Cauchy’s integral theorem the 
integral over the entire boundary of D, (taken in the sense of the arrows in Fig. 354) is 
zero, and so is the integral over the boundary of De, and thus their sum. In this sum the 
integrals over the cuts C, and Cz cancel because we integrate over them in both 
directions—this is the key—and we are left with the integrals over Cy (counterclockwise) 
and Cy (clockwise; see Fig. 354); hence by reversing the integration over C2 (to 
counterclockwise) we have 


p sac—$ fae =0 
Cc C 


1 2 


and (6) follows. | 


For domains of higher connectivity the idea remains the same. Thus, for a triply connected 
domain we use three cuts ca Cs, GC (Fig. 355). Adding integrals as before, the integrals 
over the cuts cancel and the sum of the integrals over Cy (counterclockwise) and C2, C3 
(clockwise) is zero. Hence the integral over Cy equals the sum of the integrals over C2 
and Cs, all three now taken counterclockwise. Similarly for quadruply connected domains, 
and so on. 
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Fig. 354. Doubly connected domain 


PROBLEEM—SET 14-2 


1-8| COMMENTS ON TEXT AND EXAMPLES 


1. Cauchy’s Integral Theorem. Verify Theorem 1 for 
the integral of z? over the boundary of the square with 
vertices +1 + i. Hint. Use deformation. 


2. For what contours C will it follow from Theorem | that 


dz exp (1/2?) 
(a) | S=o. (b) |S ie- 
em pi + 16 
3. Deformation principle. Can we conclude from 
Example 4 that the integral is also zero over the contour 
in Prob. 1? 


4. If the integral of a function over the unit circle equals 
2 and over the circle of radius 3 equals 6, can the 
function be analytic everywhere in the annulus 
1 < |z| <3? 


5. Connectedness. What is the connectedness of the 
domain in which (cos 2)/ (z* + 1) is analytic? 

6. Path independence. Verify Theorem 2 for the integral 
of e* from 0 to 1 + i (a) over the shortest path and 
(b) over the x-axis to 1 and then straight up to 1 + i. 


7. Deformation. Can we conclude in Example 2 that 
the integral of (2 + 4) over (a) |z — 2| = 2 and 
(b) |z — 2| = 3 is zero? 

8. TEAM EXPERIMENT. Cauchy’s Integral Theorem. 


(a) Main Aspects. Each of the problems in Examples 
1-5 explains a basic fact in connection with Cauchy’s 
theorem. Find five examples of your own, more 
complicated ones if possible, each illustrating one of 
those facts. 


(b) Partial fractions. Write f(z) in terms of partial 
fractions and integrate it counterclockwise over the unit 
circle, where 


. 2z + 3i = zt+l1 
® fo=> > WW feo= 
z+] Zz + 2z 


(c) Deformation of path. Review (c) and (d) of Team 
Project 34, Sec. 14.1, in the light of the principle of defor- 
mation of path. Then consider another family of paths 
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C, 


Fig. 355. Triply connected domain 


with common endpoints, say, z(f) = ¢ + ia(t — t?), 
0 =t = 1, a areal constant, and experiment with the 
integration of analytic and nonanalytic functions of 
your choice over these paths (e.g., z, Im z, 22, Re 22, 
In 22, etc.). 


9-19| CAUCHY’S THEOREM APPLICABLE? 


Integrate f(z) counterclockwise around the unit circle. 
Indicate whether Cauchy’s integral theorem applies. Show 
the details. 


9. f(z) = exp (—z”) 
11. f@ = 1/2z- 1) 
13. f(@ = 1/(e* - 1.1) 
15. f(2) = Imz 
17. f(D = Iz? 

19. f(z) = 2 cot z 


10. f(z) = tan Zz 

12. f(2) =z? 

14. f(z) = 1/z 

16. f(z) = 1/(mz - 1) 
18. f(z) = 1/(4z — 3) 


20-30 | FURTHER CONTOUR INTEGRALS 


Evaluate the integral. Does Cauchy’s theorem apply? Show 
details. 


20. f Ln (1 — z) dz, C the boundary of the parallelogram 
Cc 


with vertices +i, +(1 + i). 


z ‘ : 

21. ; -, C the circle |z| = 7 counterclockwise. 

z— 3i 
Cc 


22. ; Rezdz, C: 
Cc 


Use partial fractions. 
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dz y COS Z . ; 
24. ; aa C: . 27. — dz, C consists of |z] = 1 counterclockwise 
| c ” 
and |z| = 3 clockwise. 
x 

tan dz 
28. 7 dz, C the boundary of the square with 

cz — 16 

Use partial fractions. vertices +1, +7 clockwise. 


z 
e ; ; 
25; ; —dz, C consists of |z| = 2 counterclockwise and 
j 
Cc 


sin z 
29. ; dz, C:|z —4— 2i| = 5.5 clockwise. 
Cc 


z+ 2iz 
=) : 223 + 7744 
|<] = 1 clockwise. 30. ; ———-dz, C:|z — 2| = 4 clockwise. Use 
4 2 
c z +4z 
26. ; coth z dz, C the circle |z — ai | = 1 clockwise. partial fractions. 
Cc 


14.3 Cau 


THEOREM 1 


PROOF 


chy’s Integral Formula 


Cauchy’s integral theorem leads to Cauchy’s integral formula. This formula is useful for 
evaluating integrals as shown in this section. It has other important roles, such as in proving 
the surprising fact that analytic functions have derivatives of all orders, as shown in the 
next section, and in showing that all analytic functions have a Taylor series representation 
(to be seen in Sec. 15.4). 


Cauchy’s Integral Formula 


Let f(z) be analytic in a simply connected domain D. Then for any point zo in D 
and any simple closed path C in D that encloses zo (Fig. 356), 


(2) 
@) ; = Zoe 2 Fo) (Cauchy’s integral formula) 
(oj 


the integration being taken counterclockwise. Alternatively (for representing f(zo) 
by a contour integral, divide (1) by 277i), 


(1*) iE = = ; uh dz (Cauchy’s integral formula). 
2m JZ — Zo 


By addition and subtraction, f(z) = f(zo) + Lf(z) — f(Zo)]. Inserting this into (1) on the 
left and taking the constant factor f(z) out from under the integral sign, we have 


d — 
(2) ; I@ 4. = #9) ; a ; f@ — FG) 
Cc Cc Cc 


Z— Zo Z— 20 Z— Zo 


The first term on the right equals f(z9) - 277i, which follows from Example 6 in Sec. 14.2 
with m = —1. If we can show that the second integral on the right is zero, then it would 
prove the theorem. Indeed, we can. The integrand of the second integral is analytic, except 
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at Zo. Hence, by (6) in Sec. 14.2, we can replace C by a small circle K of radius p and 
center Zg (Fig. 357), without altering the value of the integral. Since f(z) is analytic, it is 
continuous (Team Project 24, Sec. 13.3). Hence, an e > 0 being given, we can find a 
5 > 0 such that | f(z) — f(zo)| < € for all z in the disk |z — zo| < &. Choosing the radius 
p of K smaller than 6, we thus have the inequality 


a 
-- T, 


- x 
Pa sy 
at \ 
\ 
\ 
, D | 
y ' 
/ J 
t Af O 
1 
1 y” 
ike _— 
x a C 
———— 
Fig. 356. Cauchy’s integral formula Fig. 357. Proof of Cauchy’s integral formula 
f@ — fGo)|_ 
< 
Z— Zo p 


at each point of K. The length of K is 27rp. Hence, by the ML-inequality in Sec. 14.1, 


z — £0 = 


| f f(2) — fo) , 


€ 
<—27rp = 27re. 
p p 


Since € (> 0) can be chosen arbitrarily small, it follows that the last integral in (2) must 
have the value zero, and the theorem is proved. a 


EXAMPLE 1. Cauchy’s Integral Formula 


& 
; dz = 2tie*| = 27ie = 46.4268: 
c Z=2 z=2 


for any contour enclosing zg = 2 (since e* is entire), and zero for any contour for which zo = 2 lies outside 
(by Cauchy’s integral theorem). i] 


EXAMPLE 2. Cauchy’s Integral Formula 


= 2mi[ze* — 3]le-12 


= — 677i (zo = 3i inside C). a 


EXAMPLE 3 __ Integration Around Different Contours 


Integrate 
+1 2+ 


MO 24 este 


counterclockwise around each of the four circles in Fig. 358. 
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Solution.  g(z) is not analytic at —1 and 1. These are the points we have to watch for. We consider each 
circle separately. 

(a) The circle |z — 1| = 1 encloses the point zg = 1 where g(z) is not analytic. Hence in (1) we have to 
write 


ati 2g+1 1 


zZ 3 
s@) 2-1 z+12z-1 
thus 
’ +1 
f@ = ztl 
and (1) gives 
potty 27rif(1) = 2 2 : 277i 
dz Ti, Ti Ti. 
ge =i y eae 


(b) gives the same as (a) by the principle of deformation of path. 


(c) The function g(z) is as before, but f(z) changes because we must take z9 = —1 (instead of 1). This gives 
a factor z — z9 = z + 1| in (1). Hence we must write 


ee a 1 
a z-1lzt1’ 
thus 
+1 
f@) = 
z= 1 


Compare this for a minute with the previous expression and then go on: 


+1 a+ 
7 dz = 27rif(—1) = 2771 | = —277i. 
Ce. ol Z- 1 Jpo-1 


(d) gives 0. Why? aa] 


Fig. 358. Example 3 


Multiply connected domains can be handled as in Sec. 14.2. For instance, if f(z) is 
analytic on C, and Cg and in the ring-shaped domain bounded by C; and Cz (Fig. 359) 
and Zg is any point in that domain, then 


(3) flee) = ; fO a 4 = IO 


Til, Z— Zo 27 
Cy 
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where the outer integral (over C;) is taken counterclockwise and the inner clockwise, as 
indicated in Fig. 359. 


fe) 
Co) 
1 


Fig. 359. Formula (3) 


PROBLEM SET 14-3 


1-4} CONTOUR INTEGRATION cosh (27 — ari) 
a . 15. ¢ —————— dz, C the boundary of the square 
Integrate z“/(z~ — 1) by Cauchy’s formula counterclockwise lc 277 
around the circle. with vertices +2, +2, +4i. 
1. |zt+1)=1 2. |z-1-i| = 7/2 iis 
3. |z +i] = 1.4 4. |z+5-5i| =7 16. ; ; des C the boundary of the triangle with 
oe 
Cc 
5-8 | Integrate the given function around the unit circle. vertices 0 and +1 + 2i. 
5. (cos 3z)/(6z) 6. e**/(az — i) iG ej 
7. /(2z -i) 8. (22 sin D(z = 1) 17. 3 24 dz, C: Iz = il =14 
z 
9. CAS EXPERIMENT. Experiment to find out to what . 
extent your CAS can do contour integration. For this, sin Z . tea teat 
use (a) the second method in Sec. 14.1 and (b) Cauchy’s 18. " Ae? = Si, dz, C consists of the boundaries of the 
integral formula. squares with vertices +3, +37 counterclockwise and 
10. TEAM PROJECT. Cauchy’s Integral Theorem. +1, +i clockwise (see figure). 
Gain additional insight into the proof of Cauchy’s 
integral theorem by producing (2) with a contour 
enclosing Zo (as in Fig. 356) and taking the limit as in y 
the text. Choose 3i 
3 F 
Zz 6 sin z 
(a) ; 7 & (b) ; 7 & 
cit Bl (ome Le 
and (c) another example of your choice. 3 3 x 
11-19} FURTHER CONTOUR INTEGRALS 
Integrate counterclockwise or as indicated. Show the 13; 
details. 
dz Problem 18 
val ; , C427 4+ (y - 27 =4 
qx +4 
2 
z : : eXp Z 
12. ; > dz, C the circle with center —1 and 19. f >= & «C consists of |z] = 2 counter- 
ee tae +3 cz(z-1-—i) 
radius 2 clockwise and |z| = 1 clockwise. 
z 
13. d: zh = = = : 
2 , aD z Cle | 20. Show that (z — 21) 1(z — za) 1 dz = 0 fora simple 
Cc 


14. ; aoe C:|z| = 0.6 closed path C enclosing z, and ze, which are 
lo ze" — 2iz arbitrary. 
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14.4 Derivatives of Analytic Functions 


THEOREM 1 


PROOF 


As mentioned, a surprising fact is that complex analytic functions have derivatives of all 
orders. This differs completely from real calculus. Even if a real function is once 
differentiable we cannot conclude that it is twice differentiable nor that any of its higher 
derivatives exist. This makes the behavior of complex analytic functions simpler than real 
functions in this aspect. To prove the surprising fact we use Cauchy’s integral formula. 


Derivatives of an Analytic Function 


If f(z) is analytic in a domain D, then it has derivatives of all orders in D, which 
are then also analytic functions in D. The values of these derivatives at a point zo 
in D are given by the formulas 


' ! 1 f(@) 
1 = ; dz 
eo 1 Xo) 277i (z — zo)” 

”" ” 2! f(2) 
oe POO = oi ; (@— zo) 
and in general 

(n) a f@ = nae 

(1) Vine Zo) ae ; Cane dz (n = 1,2,---); 


here C is any simple closed path in D that encloses zg and whose full interior belongs 
to D; and we integrate counterclockwise around C (Fig. 360). 


s ? 
SS 


Fig. 360. Theorem 1 and its proof 


COMMENT. For memorizing (1), it is useful to observe that these formulas are obtained 
formally by differentiating the Cauchy formula (1*), Sec. 14.3, under the integral sign 
with respect to Zg. 


We prove (1’), starting from the definition of the derivative 


FoF A) ~— feo 
Az ; 


, 
f@o) = jim, 
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On the right we represent f(zg + Az) and f(zo) by Cauchy’s integral formula: 


Az 27riAz z— (zg + Az) Z— Zo 


f@o + Az) — fo) I { f@ f f@) | 
= d dz |. 
Cc Cc 
We now write the two integrals as a single integral. Taking the common denominator 
gives the numerator f(z){z — zo — [z — (zo + Az)]} = f(z) Az, so that a factor Az drops 
out and we get 


f(zo + Az) — fo) 1 ; F(2) 
= Iz 

Az 2nt Jo @— Ze — ADE — Za) 
Clearly, we can now establish (1’) by showing that, as Az— 0, the integral on the right 
approaches the integral in (1’). To do this, we consider the difference between these two 
integrals. We can write this difference as a single integral by taking the common 
denominator and simplifying the numerator (as just before). This gives 


; f(@ ; {@ ; f(z) Az 
dz 3 dz = 3 dz. 
c &% — Zo — Azz — Zo) c & — Zo) co %— Zo — Azz — Zo) 


We show by the ML-inequality (Sec. 14.1) that the integral on the right approaches zero 
as Az—>0. 

Being analytic, the function f(z) is continuous on C, hence bounded in absolute value, 
say, | f(2)| = K. Let d be the smallest distance from zo to the points of C (see Fig. 360). 
Then for all z on C, 


Iz - Zol” = a*, hence ———_— = : 
Iz - zol” da” 


Furthermore, by the triangle inequality for all z on C we then also have 


dS |z— zol = |z — zo — Az + Az] S lz — zp — Az| + |Adl. 
We now subtract |Az| on both sides and let |Az| S d/2, so that —|Az| = —d/2. Then 


1 2 
=-—, 
lz -—zqo — Azl d 


3d =d-—|Az| S|z—-z — Adl. Hence 


Let L be the length of C. If |Az| S d/2, then by the ML-inequality 


2s 
@2 


; f@ Az dz| = KL |Az| a 
c(z — Zo — Azz — Zo)” d 


This approaches zero as Az— 0. Formula (1’) is proved. 

Note that we used Cauchy’s integral formula (1*), Sec. 14.3, but if all we had known 
about f(z) is the fact that it can be represented by (1*), Sec. 14.3, our argument would 
have established the existence of the derivative f (zo) of f(z). This is essential to the 
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continuation and completion of this proof, because it implies that (1”) can be proved by 
a similar argument, with f replaced by f’, and that the general formula (1) follows by 
induction. o 


Applications of Theorem 1 


Evaluation of Line Integrals 


From (1’), for any contour enclosing the point zi (counterclockwise) 


= —27i sin Ti = 27 sinh 7. |_| 


z=77t 


COs Z : 
; ~; dz = 27i(cos z)' 
c (Z — Ti) 


From (1”), for any contour enclosing the point —i we obtain by counterclockwise integration 


dz = Tri(z* — 3z7 + 6)" ari[12z" — 6],—-; 187i. | 


z=-t 


pests 
oc (tiP 


By (1’), for any contour for which 1 lies inside and +2 lie outside (counterclockwise), 


bcaaneea®2"(a54) 
iz 7 
co — Ie +4) \244 


e(z? + 4) — 6722 
TL 
(22 oe 4) 


z=1 


6. 
= = i ~ 2.0501. a 


Cauchy’s Inequality. Liouville’s and Morera’s Theorems 


We develop other general results about analytic functions, further showing the versatility 
of Cauchy’s integral theorem. 


Cauchy’s Inequality. Theorem | yields a basic inequality that has many applications. 


To get it, all we have to do is to choose for C in (1) a circle of radius r and center zo and 
apply the ML-inequality (Sec. 14.1); with |f(z)| S M on C we obtain from (1) 


; f(2 i 
cl 


feo = 2 


277 


This gives Cauchy’s inequality 


niM 


n 
r 


(2) lf"zo)l| S 


To gain a first impression of the importance of this inequality, let us prove a famous 
theorem on entire functions (definition in Sec. 13.5). (For Liouville, see Sec. 11.5.) 


Liouville’s Theorem 


Tf an entire function is bounded in absolute value in the whole complex plane, then 
this function must be a constant. 


SEC. 14.4 Derivatives of Analytic Functions 


PROOF 


THEOREM 3 


PROOF 
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By assumption, |f(z)| is bounded, say, |f(z)| < K for all z. Using (2), we see that 
| f ‘(zo)| < K/r. Since f(z) is entire, this holds for every r, so that we can take r as large 
as we please and conclude that f (zo) = 0. Since zo is arbitrary, f z) = uy + ivy = 0 for 
all z (see (4) in Sec. 13.4), hence uz = vz = 0, and uy = vy = O by the Cauchy—Riemann 
equations. Thus uw = const, v = const, and f = u + iv = const for all z. This completes 
the proof. a 


Another very interesting consequence of Theorem | is 


Morera’s” Theorem (Converse of Cauchy’s Integral Theorem) 


If f(z) is continuous in a simply connected domain D and if 
(3) ; f(@@ dz =0 
Cc 


for every closed path in D, then f(z) is analytic in D. 


In Sec. 14.2 we showed that if f(z) is analytic in a simply connected domain D, then 


| F(2*) dz* 


Zo 


F(z) = 


is analytic in D and F'(z) = f(z). In the proof we used only the continuity of f(z) and the 
property that its integral around every closed path in D is zero; from these assumptions 
we concluded that F(z) is analytic. By Theorem 1, the derivative of F(z) is analytic, that 
is, f(z) is analytic in D, and Morera’s theorem is proved. ia 


This completes Chapter 14. 


PROBLEM SET 14-4 


1-7 


CONTOUR INTEGRATION. UNIT CIRCLE 


Integrate counterclockwise around the unit circle. 


INTEGRATION. DIFFERENT CONTOURS 


Integrate. Show the details. Hint. Begin by sketching the 
contour. Why? 


2+ sinz 
8. ———_,-dz, 
e @=) 


78 
ihe oe 
c @z— J) C the boundary of the square with 


ee e* Cos z ; 
3. <a dz, n=1,2, 4. =~ — dz vertices +2, +27 counterclockwise. 
cz c (z — 77/4) 
tan 77z : 2 2 . 
f cosh 2z f dz 9. 5 4, Cthe ellipse 16x" + y" = 1 clockwise. 
5 dz > Cz 
ce — 3)" é@=-27¢-127 


423 — 6 ; 
10. oe C consists of |z]| = 3 counter- 
c ez— 1-1) 


clockwise and |z| = 1 clockwise. 


2GIACINTO MORERA (1856-1909), Italian mathematician who worked in Genoa and Turin. 


11. 


12. 


13. 


14. 


15. 


16. 


17. 
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(1 + z)sinz : 
—— — dz, C:|z — i| = 2 counterclockwise. 
c 


C: z — 3i| = 2 clockwise. 


(g = 29° 
Ln z : 
7 dz, C:|z — 3| = 2 counterclockwise. 
ew 
Ln (z + 3) 
5 C the boundary of the square 
c @ — 2+ 17 


with vertices +1.5, 1.57, counterclockwise. 


h 4. 

; COs 3 de, 
c(— 4) 

wise and |z — 3| = 2 clockwise. 


4z 
; —__, 
c 2Z — 2i) 


clockwise and |z| 


; e “sing | 
c@z-4% ° 


wise and |z — 3| 


C consists of |z]| = 6 counterclock- 


C consists of |z — i| = 3 counter- 


= | clockwise. 


C consists of |z| = 5 counterclock- 


= 3 clockwise. 


18. 


19. 


20. 


sinh z ; : 
dz, C:|z| = 1 counterclockwise, n integer. 
Cc 


<" 


3z 

e : 
f 3 &, C: |z| = 1, counterclockwise. 
c (4z — Tri) 


TEAM PROJECT. Theory on Growth 


(a) Growth of entire functions. If f(z) is not a 
constant and is analytic for all (finite) z, and R and 
M are any positive real numbers (no matter how 
large), show that there exist values of z for which 
lz] > R and |f(z)| > M. Hint. Use Liouville’s 
theorem. 

(b) Growth of polynomials. If f(z) is a polynomial 

of degree n > 0 and M is an arbitrary positive 

real number (no matter how large), show that 

there exists a positive real number R such that 

| f(2)| > M for all |z| > R. 

(c) Exponential function. Show that f(z) = e” has 
the property characterized in (a) but does not have 
that characterized in (b). 

(d) Fundamental theorem of algebra. /f f(z) is a 

polynomial in z, not a constant, then f(z) = 0 for 

at least one value of z. Prove this. Hint. Use (a). 


CHAPTER 14 REVIEW QUESTIONS AND PROBLEMS 


1. 


11. 


What is a parametric representation of a curve? What 
is its advantage? 


. What did we assume about paths of integration z = z(t)? 


What is z = dz/dt geometrically? 


. State the definition of a complex line integral from 


memory. 


. Can you remember the relationship between complex 


and real line integrals discussed in this chapter? 


. How can you evaluate a line integral of an analytic 


function? Of an arbitrary continous complex function? 


. What value do you get by counterclockwise integration 


of 1/z around the unit circle? You should remember 
this. It is basic. 


. Which theorem in this chapter do you regard as most 


important? State it precisely from memory. 


. What is independence of path? Its importance? State a 


basic theorem on independence of path in complex. 


. What is deformation of path? Give a typical example. 
10. 


Don’t confuse Cauchy’s integral theorem (also known 
as Cauchy-Goursat theorem) and Cauchy’s integral 
formula. State both. How are they related? 


What is a doubly connected domain? How can you 
extend Cauchy’s integral theorem to it? 


12. 


13. 


14. 


15. 


16. 
17. 


18. 


19. Is 


20. 


What do you know about derivatives of analytic 
functions? 

How did we use integral formulas for derivatives in 
evaluating integrals? 

How does the situation for analytic functions differ 
with respect to derivatives from that in calculus? 
What is Liouville’s theorem? To what complex func- 
tions does it apply? 

What is Morera’s theorem? 

If the integrals of a function f(z) over each of the two 
boundary circles of an annulus D taken in the same 
sense have different values, can f(z) be analytic every- 
where in D? Give reason. 


Is Im ; {Qa = ; Im f(z) dz? Give reason. 
c c 


; f(z) dz 
C 


= ; | f(z)| dz? 
G 


How would you find a bound for the left side in Prob. 19? 


21-30 


INTEGRATION 


Integrate by a suitable method. 


21. 


| z sinh (z”) dz from 0 to Ti/2. 
Cc 
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22. | (\z| + z) dz clockwise around the unit circle. 27. | (2? + z7) dz from 0 to 2 + 2i, shortest path. 
Cc Cc 
23. | z°e* dz counterclockwise around |z| = 7. 28. $= = dz counterclockwise around gonad 
Cc cea 21)? 
24. | Rezdz from 0 to 3 + 27i along y = x°. : 
29. dz clockwise around 
GNe + at z+ an 


tan 77z _ 
gz clockwise around |z — 1| = 0.1. lz— 1] =25. 


30. | sin z dz from 0 to (1 + 2). 
26. | (2? + 2”) dz from z = 0 horizontally to z = 2, then c 

Cc 
vertically upward to 2 + 2i. 


SUMMARY—OF CHAPTER 14 


Complex Integration 


The complex line integral of a function f(z) taken over a path C is denoted by 


(1) | fo dz or, if C is closed, also by ; f@ (Sec. 14.1). 
Cc Cc 


If f(z) is analytic in a simply connected domain D, then we can evaluate (1) as in 
calculus by indefinite integration and substitution of limits, that is, 


(2) | fo dz = F(z1) — Fo) [F'@ =f@l 


Cc 


for every path C in D from a point zg to a point z; (see Sec. 14.1). These assumptions 
imply independence of path, that is, (2) depends only on zg and z; (and on f(z), 
of course) but not on the choice of C (Sec. 14.2). The existence of an F(z) such that 
F(z) = f(z) is proved in Sec. 14.2 by Cauchy’s integral theorem (see below). 

A general method of integration, not restricted to analytic functions, uses the 
equation z = z(t) of C, wherea St Sb, 


° : . 
(3) |r (2) dz = | f(e(D)z(f) at (: = “). 
Cc a 


Cauchy’s integral theorem is the most important theorem in this chapter. It states 
that if f(z) is analytic in a simply connected domain D, then for every closed path 
C in D (Sec. 14.2), 


(4) ; F@ dz = 0. 
c 
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Under the same assumptions and for any Zo in D and closed path C in D containing 
Zo in its interior we also have Cauchy’s integral formula 


277i Z = 26 


(5) ee ; IO) 
Cc 


Furthermore, under these assumptions f(z) has derivatives of all orders in D that are 
themselves analytic functions in D and (Sec. 14.4) 


(6) fo) _ n! ; FAC) se GSS ek: 
Cc 


Imi Ie (Z — 2)" 


This implies Morera’s theorem (the converse of Cauchy’s integral theorem) and 
Cauchy’s inequality (Sec. 14.4), which in turn implies Liouville’s theorem that an 
entire function that is bounded in the whole complex plane must be constant. 


CHAPTER | 5 


Power Series, Taylor Series 


In Chapter 14, we evaluated complex integrals directly by using Cauchy’s integral formula, 
which was derived from the famous Cauchy integral theorem. We now shift from the 
approach of Cauchy and Goursat to another approach of evaluating complex integrals, 
that is, evaluating them by residue integration. This approach, discussed in Chapter 16, 
first requires a thorough understanding of power series and, in particular, Taylor series. 
(To develop the theory of residue integration, we still use Cauchy’s integral theorem!) 

In this chapter, we focus on complex power series and in particular Taylor series. They 
are analogs of real power series and Taylor series in calculus. Section 15.1 discusses 
convergence tests for complex series, which are quite similar to those for real series. Thus, 
if you are familiar with convergence tests from calculus, you may use Sec. 15.1 as a 
reference section. The main results of this chapter are that complex power series represent 
analytic functions, as shown in Sec. 15.3, and that, conversely, every analytic function 
can be represented by power series, called a Taylor series, as shown in Sec. 15.4. The last 
section (15.5) on uniform convergence is optional. 


Prerequisite: Chaps. 13, 14. 
Sections that may be omitted in a shorter course: 15.1, 15.5. 
References and Answers to Problems: App. 1 Part D, App. 2. 


15.1 Sequences, Series, Convergence Tests 


The basic concepts for complex sequences and series and tests for convergence and 
divergence are very similar to those concepts in (real) calculus. Thus if you feel at home 
with real sequences and series and want to take for granted that the ratio test also holds 
in complex, skip this section and go to Section 15.2. 


Sequences 


The basic definitions are as in calculus. An infinite sequence or, briefly, a sequence, is 
obtained by assigning to each positive integer n a number Z,, called a term of the sequence, 
and is written 


Z1,22,°°" or {Z1, 22° °° } or briefly {Zn}. 


We may also write zo, Z1,°** OF Z2, Z3,°** or start with some other integer if convenient. 
A real sequence is one whose terms are real. 
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Convergence. A convergent sequence z1, Z2,°-: is one that has a limit c, written 
lim z, =c or simply Rig. 
no 


By definition of limit this means that for every € > 0 we can find an N such that 
(1) lzn —cl <e for all n > N; 


geometrically, all terms z,, with n > N lie in the open disk of radius € and center c (Fig. 361) 
and only finitely many terms do not lie in that disk. [For a real sequence, (1) gives an open 
interval of length 2€ and real midpoint c on the real line as shown in Fig. 362.] 

A divergent sequence is one that does not converge. 


x c-Eé Cc c+ée x 


Fig. 361. Convergent complex sequence Fig. 362. Convergent real sequence 


Convergent and Divergent Sequences 


The sequence {i"/n} = {i, -4, —i/3, i -++} is convergent with limit 0. 
The sequence {i"} = {i, —1, —i, 1,-+- } is divergent, and so is {z,} with z, = (1 + i)”. o 


Sequences of the Real and the Imaginary Parts 


The sequence {z,} with zz =Xn+ in =1 1/n? + i(2 + 4/n) is 6i,3 + 41,8 + 10i/3, 34 Bie, 
(Sketch it.) It converges with the limit c = 1 + 2i. Observe that {x,,} has the limit | = Rec and {y,} has 
the limit 2 = Imc. This is typical. It illustrates the following theorem by which the convergence of a 
complex sequence can be referred back to that of the two real sequences of the real parts and the imaginary 
parts. ia] 


Sequences of the Real and the Imaginary Parts 


A sequence 24, Z2,°°*,Zn,'°* of complex numbers Zp = Xy + ivn (where n= 1, 
2,°++) converges toc = a + ibifand only if the sequence of the real parts x1, X2,°** 
converges to a and the sequence of the imaginary parts y1, y2,°** converges to b. 


Convergence z,—c = a-+ ib implies convergence x,—a and y,—b because if 
lz, — c| < e, then z,, lies within the circle of radius € about c = a + ib, so that (Fig. 363a) 


lxn — al <€, lyn — b| <e. 


Conversely, if x, — a and y, — b as n— ~, then for a given € > 0 we can choose 
N so large that, for every n > N, 
. 


€ 
n-al<£, lyn 1 <§ 


2 
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at-—-—-—|---4 


(a) 
Fig. 363. Proof of Theorem 1 


These two inequalities imply that z, = x, + iyy lies in a square with center c and side 


e. Hence, z,, must lie within a circle of radius € with center c (Fig. 363b). B 
Series 
Given a sequence Z, Z2,°**, Zm,***, we may form the sequence of the sums 

Sy = Z15 Sg = Z1 + Za, $3 = Z1 + Zo + Z3, 


and in general 
(2) Sy =Z%ytZte + ZH (n = 1, 2,---). 


Here s,, is called the nth partial sum of the infinite series or series 
(3) Dea RH ee ee 
m=1 


The z 1, Z2,°°* are called the terms of the series. (Our usual summation letter is n, unless 
we need n for another purpose, as here, and we then use m as the summation letter.) 
A convergent series is one whose sequence of partial sums converges, say, 


co 


lim sy, = s. Then we write s= Si lm = Zit zgt °° 


nn 
m=1 


and call s the sum or value of the series. A series that is not convergent is called a divergent 
series. 
If we omit the terms of s,, from (3), there remains 


(4) Ry = Zn41 + Zn4+2 7 Zn43 7 °°". 


This is called the remainder of the series (3) after the term zy. Clearly, if (3) converges 
and has the sum s, then 


S = Sy, + Ry, thus Ry = S$ — Sy- 


Now s,,— s by the definition of convergence; hence R,, — 0. In applications, when s is 
unknown and we compute an approximation s,, of s, then |R,,| is the error, and R, — 0 
means that we can make |R,,| as small as we please, by choosing n large enough. 
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An application of Theorem | to the partial sums immediately relates the convergence 
of a complex series to that of the two series of its real parts and of its imaginary parts: 


Real and Imaginary Parts 


A series (3) with Zm = Xm + ivm converges and has the sum s = u + iv if and 
only if x1 + xg + +++ converges and has the sum u and yy + yg + +--+ converges 
and has the sum v. 


Tests for Convergence and Divergence of Series 


Convergence tests in complex are practically the same as in calculus. We apply them 
before we use a series, to make sure that the series converges. 
Divergence can often be shown very simply as follows. 


Divergence 


If a series zy + Z2 +++: converges, then lim Z,, = 0. Hence if this does not hold, 
A és moo 
the series diverges. 


If z1 + za + --: converges, with the sum s, then, since Z, = Sm — Sm-—1, 
lim zm = lim (sy, — Ssm-1) = lim sy, — lim sm_1 = 5s — s = 0. ia 
mo m2 mx mx 


CAUTION! Zz m— 0 is necessary for convergence but not sufficient, as we see from the 
harmonic series | + 5 “+ 3 “F i + -++, which satisfies this condition but diverges, as is 
shown in calculus (see, for example, Ref. [GenRef11] in App. 1). 


The practical difficulty in proving convergence is that, in most cases, the sum of a series 
is unknown. Cauchy overcame this by showing that a series converges if and only if its 
partial sums eventually get close to each other: 


Cauchy’s Convergence Principle for Series 


A series Z1 + Z2 + ++: is convergent if and only if for every given € > 0 (no matter 
how small) we can find an N (which depends on e€, in general) such that 


(5) |zntit Znzgteo + ee <e for every n > N and p = 1, 2,:°- 


The somewhat involved proof is left optional (see App. 4). 


Absolute Convergence. A series z} + zo + -:: is called absolutely convergent if the 
series of the absolute values of the terms 


SD zm! = lzal + Izal + -- 


m=1 


is convergent. 
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If zy + zo + -+- converges but |z,| + |zo| + --- diverges, then the series z, + zg + °°: 


is called, more precisely, conditionally convergent. 


A Conditionally Convergent Series 


The series 1 — 3 + 3 = 3 + —-++ converges, but only conditionally since the harmonic series diverges, as 


mentioned above (after Theorem 3). 


If a series is absolutely convergent, it is convergent. 


This follows readily from Cauchy’s principle (see Prob. 29). This principle also yields 


the following general convergence test. 


Comparison Test 


with nonnegative real terms such that |z1| S by, 
converges, even absolutely. 


Ifa series z1 + Zg + +--+ is given and we can find a convergent series by + by + -:: 
zal = bo,:--, then the given series 


By Cauchy’s principle, since bj + bo + --- converges, for any given € > 0 we can find 


an N such that 


by+i tet: + bnip <€ for every n > N and p = 1, 2,::-. 


From this and |z1| S by, |zo| S bo,-++ we conclude that for those n and p, 


Izneal +°** + [Zntpl S nga t +++ + Paap <6 


Hence, again by Cauchy’s principle, 
absolutely convergent. 


A good comparison series is the geometric series, which behaves as follows. 


zil + lzol + +++ converges, so that z; + zo +- 


+. 4S 


Geometric Series 


The geometric series 


(6*) Sa alt gt gre 


m=0 


converges with the sum 1/(1 — q) if |q| < 1 and diverges if |q| = 1. 


If lq| = |, then la™*| = | and Theorem 3 implies divergence. 
Now let |g| < 1. The nth partial sum is 


S, = ltaqt--- tq”. 


From this, 


Gh geet gh gh. 
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On subtraction, most terms on the right cancel in pairs, and we are left with 


n+1 


Sn Gn = (1 Qsn = 1 q 


Now | — gq # 0 since g # 1, and we may solve for s,,, finding 
_ es 1 7 qk 
Ea 


(6) Sn = 


Since Iq| <1, the last term approaches zero as n >. Hence if lq| < 1, the series is 
convergent and has the sum 1/(1 — q). This completes the proof. a 


Ratio Test 


This is the most important test in our further work. We get it by taking the geometric 
series as comparison series by + bo + -:- in Theorem 5: 


Ratio Test 


Ifa series zy + Zg + +++ withzZ, # O(n = 1, 2,---) has the property that for every 


n greater than some N, 


Znt+1 
in 


(7) <q<1 


(n > N) 


(where q < | is fixed), this series converges absolutely. If for every n > N, 


(8) (n > N), 


the series diverges. 


If (8) holds, then lepacall = IZ for n > N, so that divergence of the series follows from 
Theorem 3. 
If (7) holds, then lzn+1 = IZ q for n > N, in particular, 


Izwisl S lewsela S lzwail?, etc., 


Izn+el = lznealg, 


and in general, Izn+pl = lenweala? 7. Since g < 1, we obtain from this and Theorem 6 


1 
l-q 


lzwaal + lzweel + lzwasl + --- Slenusld tqatq?t +) S lena 


Absolute convergence of zj + zg +--+ now follows from Theorem 5. ei] 
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CAUTION! The inequality (7) implies |z741/zn| < 1, but this does not imply con- 
vergence, as we see from the harmonic series, which satisfies z,41/Zn = n/(n + 1) < 1 for 
all n but diverges. 


If the sequence of the ratios in (7) and (8) converges, we get the more convenient 


Ratio Test 
Zn+1 


Ifa series z1 + Za + +++ withz, # O(n = 1,2,--+-) is such that lim 
then: 


=i, 
n 
(a) If L <1, the series converges absolutely. 

(b) If L > 1, the series diverges. 


(c) If L = 1, the series may converge or diverge, so that the test fails and 
permits no conclusion. 


(a) We write ky = |Zn+1/Zn| and let L = 1 — b < 1. Then by the definition of limit, the 
ky, must eventually get close to 1 — b, say, ky, Sq = 1 - xb < 1 for all n greater than 
some NV. Convergence of z1 + zo + -:- now follows from Theorem 7. 
(b) Similarly, forL = 1 +c > 1 wehavek, 21+ xc > 1 for all n > N* (sufficiently 
large), which implies divergence of z1 + zg + --- by Theorem 7. 
(c) The harmonic series 1 + 4 + % +°:* has Zy44/Zn = n/(n + 1), hence L = 1, and 
diverges. The series 

i, 1 Zn+1 


Pe ed + p Aus has =a 
4° 9 16 25 ZnO (n + 1)?’ 


hence also L = 1, but it converges. Convergence follows from (Fig. 364) 


n 
1 1 dx 1 
Sn = 1 eyo + w=! + | 2 =2-—-, 


1 


so that 51, 59,-:- is a bounded sequence and is monotone increasing (since the terms of 
the series are all positive); both properties together are sufficient for the convergence of 
the real sequence sy, s9,---. (In calculus this is proved by the so-called integral test, whose 
idea we have used.) B 


0) 1 2 3 4 x 


Fig. 364. Convergence of the series1 +i + 5+%+°:: 
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EXAMPLE 4 _ Ratio Test 
Is the following series convergent or divergent? (First guess, then calculate.) 
(100 + 751)" 


= 1 
>> 1 + (100 + 757) 4 (100 + 75i)? +++: 
n! 2! 


n=0 


Solution. By Theorem 8, the series is convergent, since 


100 + 75i/"**/(n + 1)! [100 + 75i 125 
_! yes | il _ > L=0. a 
|100 + 75i|"/n! n+1 n+1 


Zn+1 


in 


EXAMPLE 5. Theorem 7 More General Than Theorem 8 


Let ay, = i/2?” and b, = 1/23”*1. Is the following series convergent or divergent? 


ago + bo tay t+ by +--+ =i 4 + + + t Poets 


Solution. The ratios of the absolute values of successive terms are 3 . x, x, 3, --+, Hence convergence follows 
from Theorem 7. Since the sequence of these ratios has no limit, Theorem 8 is not applicable. | 
Root Test 


The ratio test and the root test are the two practically most important tests. The ratio test 
is usually simpler, but the root test is somewhat more general. 


THEOREM 9 Root Test 


If a series z1 + Za +++: is such that for every n greater than some N, 
(9) Vienl Sq <1 (n > N) 
(where q < 1 is fixed), this series converges absolutely. If for infinitely many n, 


(10) Vi2n| = 1, 


the series diverges. 


PROOF If (9) holds, then lZn =q" <1 for all n >N. Hence the series \z4| + \zo| = 
converges by comparison with the geometric series, so that the series z3 + za + °:- 
converges absolutely. If (10) holds, then |z,,| = 1 for infinitely many n. Divergence of 
Z1 + zq + -:: now follows from Theorem 3. | 


CAUTION! Equation (9) implies V/|z,,| < 1, but this does not imply convergence, as 
we see from the harmonic series, which satisfies V1 /n < 1 (for n > 1) but diverges. 
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If the sequence of the roots in (9) and (10) converges, we more conveniently have 


Root Test 


If a series z1 + Za +++ is such that lim V/ lzn| = L, then: 
nw 


(a) The series converges absolutely if L < 1. 
(b) The series diverges if L > 1. 
(c) If L = 1, the test fails; that is, no conclusion is possible. 


PROBLEEM—SET 15-1 


1-10 | SEQUENCES 
Is the given sequence 21, Z9,°°*,Zn,°** bounded? Con- 
vergent? Find its limit points. Show your work in detail. 
lL. zy, = (1 + 2"/2" 2. Zn = (3 + 4i)"/n! 

3. Zn = nt/(4 + 2ni) 4, zn = (1+ 21)” 

5. Z, = (-1)" + 107 6. Zn = (cos n7ri)/n 

7. Zn =n? t+ i/n? 8. zn = [1 + 3/V10]" 
9. Zn = (3 + 3i-” 10. zn = sin (An) + i” 
11. CAS EXPERIMENT. Sequences. Write a program 


12. 


13. 


14. 


15. 


for graphing complex sequences. Use the program to 
discover sequences that have interesting “geometric” 
properties, e.g., lying on an ellipse, spiraling to its limit, 
having infinitely many limit points, etc. 

Addition of sequences. If z1, z2,--- converges with 
the limit / and zi, z5,--- converges with the limit /*, 
show that z1 + zj, z2 + z5,--- is convergent with the 
limit / + /*, 

Bounded sequence. Show that a complex sequence 
is bounded if and only if the two corresponding 
sequences of the real parts and of the imaginary parts 
are bounded. 


On Theorem 1. Illustrate Theorem | by an example 
of your own. 


On Theorem 2. Give another example illustrating 
Theorem 2. 


16-25 


SERIES 


Is the given series convergent or divergent? Give a reason. 
Show details. 


16. 


18. 


5 (20 + 301)” iy: s (-i)" 
5 n! aa Inn 
S (2) o> 
n=1 4 n=0 mad 


20. 


21. 


22. 


23. 


24. 


25. 


26. 


27. 


28. 


29. 


30. 


2n+1 


=. (7 + Tri) 
2 (2n + 1)! 


iia 
(2n)! 
(3i)"n! 


is 

n=1 

Significance of (7). What is the difference between (7) 
and just stating |zn44/Znl < 1? 


On Theorems 7 and 8. Give another example showing 
that Theorem 7 is more general than Theorem 8. 


CAS EXPERIMENT. Series. Write a program for 
computing and graphing numeric values of the first n 
partial sums of a series of complex numbers. Use the 
program to experiment with the rapidity of convergence 
of series of your choice. 


Absolute convergence. Show that if a series converges 
absolutely, it is convergent. 


Estimate of remainder. Let |z,+1/znl Sq < 1, so 
that the series z1 + zg + --- converges by the ratio test. 
Show that the remainder Ry = Zn+1 + Znt24+°°° 
satisfies the inequality |Ry| S |zn+1l/( — q). Using 
this, find how many terms suffice for computing the 
sum s of the series 


Ss nt+i 
v3 
et Qn 


with an error not exceeding 0.05 and compute s to this 
accuracy. 
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15.2 Power Series 


EXAMPLE 1 


EXAMPLE 2 


The student should pay close attention to the material because we shall show how power 
series play an important role in complex analysis. Indeed, they are the most important series 
in complex analysis because their sums are analytic functions (Theorem 5, Sec. 15.3), and 
every analytic function can be represented by power series (Theorem 1, Sec. 15.4). 

A power series in powers of z — Zo 1s a series of the form 


-) 


(1) SD, an(z — 20)” = ao + ax(z — 20) + a2(z — Zo)? + °° 
n=0 
where z is a complex variable, do, a,,--- are complex (or real) constants, called the 


coefficients of the series, and zg is a complex (or real) constant, called the center of the 
series. This generalizes real power series of calculus. 
If zo = 0, we obtain as a particular case a power series in powers of Zz: 


(2) Sanz” = ag + ayz + gz? + +: 
n=0 


Convergence Behavior of Power Series 


Power series have variable terms (functions of z), but if we fix z, then all the concepts 
for series with constant terms in the last section apply. Usually a series with variable 
terms will converge for some z and diverge for others. For a power series the situation is 
simple. The series (1) may converge in a disk with center zg or in the whole z-plane or 
only at zo. We illustrate this with typical examples and then prove it. 


Convergence in a Disk. Geometric Series 


The geometric series 


converges absolutely if |z| < 1 and diverges if |z| = 1 (see Theorem 6 in Sec. 15.1). 4] 


Convergence for Every z 


The power series (which will be the Maclaurin series of e* in Sec. 15.4) 


a a ee 

>» l+z+—+—+4 
! 

n=0 c 


is absolutely convergent for every z. In fact, by the ratio test, for any fixed z, 


n+1 
z"/(a + 1)! 
| > 0 as n— @, |_| 


z"/n! 
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EXAMPLE 3 _ Convergence Only at the Center. (Useless Series) 


The following power series converges only at z = 0, but diverges for every z # 0, as we shall show. 


In fact, from the ratio test we have 


(n + 1)le™*4 
ion =(n+ Ilz| > © as no (z fixed and #0). Mf 
niz 
THEOREM 1 Convergence of a Power Series 


(a) Every power series (1) converges at the center Zo. 


(b) Jf (1) converges at a point z = z1 # Zo, it converges absolutely for every z 
closer to Zo than zy, that is, |z — zo| < |z1 — zol. See Fig. 365. 


(c) If (1) diverges at z = Ze, it diverges for every z farther away from Zo than 
Zg. See Fig. 365. 


----~ 


- ie ~ Divergent 


\ 
\ 
1 
$2 
4 


Fig. 365. Theroem 1 


PROOF (a) For z = Zo the series reduces to the single term do. 


(b) Convergence at z = z, gives by Theorem 3 in Sec. 15.1 a,(zy — zo)" > Oasn— ~. 
This implies boundedness in absolute value, 


la,(z1 — Zo)"| <M for every n = 0,1,---. 
Multiplying and dividing a,,(z — zo)" by (zy; — zo)” we obtain from this 


n 
mj. £&O 
Ay(Z1 — Zo)” Pa — 2) | =M 


Zz. 20 


— nr) ee 
lan(z — zo)” | = 


Summation over n gives 


oo co 


(3) > lanz - zo)"| = u> 
n=1 


n=1 


z — £0 
Z1 — 20 


Now our assumption lz — zol < lz, — zol implies that \(z - zo)/(Z1 — zo)| < 1. Hence 
the series on the right side of (3) is a converging geometric series (see Theorem 6 in 
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Sec. 15.1). Absolute convergence of (1) as stated in (b) now follows by the comparison 
test in Sec. 15.1. 


(c) If this were false, we would have convergence at a z3 farther away from zg than Zo. 
This would imply convergence at Zg, by (b), a contradiction to our assumption of divergence 
at z 2. ia) 


Radius of Convergence of a Power Series 


Convergence for every z (the nicest case, Example 2) or for no z # Zo (the useless case, 
Example 3) needs no further discussion, and we put these cases aside for a moment. We 
consider the smallest circle with center zg that includes all the points at which a given 
power series (1) converges. Let R denote its radius. The circle 


Iz—zol =R (Fig. 366) 


is called the circle of convergence and its radius R the radius of convergence of (1). Theorem 
1 then implies convergence everywhere within that circle, that is, for all z for which 


(4) Iz- zl <R 


(the open disk with center zg and radius R). Also, since R is as small as possible, the series 
(1) diverges for all z for which 


(5) Iz-— zl > R. 
No general statements can be made about the convergence of a power series (1) on the 
circle of convergence itself. The series (1) may converge at some or all or none of the 


points. Details will not be important to us. Hence a simple example may just give us 
the idea. 


Divergent 


Fig. 366. Circle of convergence 


EXAMPLE 4 _ Behavior on the Circle of Convergence 


On the circle of convergence (radius R = | in all three series), 
> <"/n* converges everywhere since > 1/n? converges, 
>z"/n converges at —1 (by Leibniz’s test) but diverges at 1, 


+z" diverges everywhere. fal 
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THEOREM 2 


PROOF 


EXAMPLE 5 


Notations R = and R = 0. To incorporate these two excluded cases in the present 
notation, we write 


R = © if the series (1) converges for all z (as in Example 2), 
R= 0 if (1) converges only at the center z = Zo (as in Example 3). 


These are convenient notations, but nothing else. 


Real Power Series. In this case in which powers, coefficients, and center are real, 
formula (4) gives the convergence interval |x — xo| < R of length 2R on the real line. 


Determination of the Radius of Convergence from the Coefficients. For this important 
practical task we can use 


Radius of Convergence R 


Suppose that the sequence Gel Ge, ,n = 1,2,---, converges with limit ae If 
L* = 0, then R = ~; that is, the power series (1) converges for all z. If L* # 0 
(hence iS 0), then 


1 : 
(6) R= hae lim 


no 


(Cauchy—Hadamard formula’). 


An+1 


If | edie as 1/Anl — ©, then R = 0 (convergence only at the center Zo). 


For (1) the ratio of the terms in the ratio test (Sec. 15.1) is 


ell 
On+1(Z — Zo)” 
Ap(Z — Zo)" 


an+1 


lz—zol.  Thelimitis L=L*|z— zol. 


an 


Let 2" + 0, thus L* >0. We have convergence if L= L*|z _ Zo| <1, thus 
lz — zol < 1/L*, and divergence if |z — zo| > 1/L*. By (4) and (5) this shows that 1/L* 
is the convergence radius and proves (6). 

If L* = 0, then L = 0 for every z, which gives convergence for all z by the ratio test. 
If |ay41/an| > ~, then |ay+1/an||z — zol > 1 for any z # Zo and all sufficiently large 
n. This implies divergence for all z # Zo by the ratio test (Theorem 7, Sec. 15.1). a 


Formula (6) will not help if L™ does not exist, but extensions of Theorem 2 are still 
possible, as we discuss in Example 6 below. 


Radius of Convergence 


= (2n)! 
By (6) the radius of convergence of the power series > —— (z — 3i)” is 
n=0 (nl)? 
(2n!) / (2n + 2)! ; (2n!) (n+ 1)? ; (n + 1)? 1 
R im 7 a lim : 7 lim : 
n= | (nty?/ (n + 1!) n= | (2n + 2)! (n!y n> (2n + 2)(2n + 1) 4 
The series converges in the open disk |z — 3i| < + of radius 4 and center 3i. Bo 


1Named after the French mathematicians A. L. CAUCHY (see Sec. 2.5) and JACQUES HADAMARD 
(1865-1963). Hadamard made basic contributions to the theory of power series and devoted his lifework to 
partial differential equations. 
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EXAMPLE 6 _ Extension of Theorem 2 


Find the radius of convergence R of the power series 


= 1 1 1 1 1 
> }14+ Cy" 4 zg? =34 ct(2+t)e: o+(23 )tae 
5 2” 2 4 8 16 


Solution. The sequence of the ratios d, 2(2.+ 4), 1/(8(2 + 4)), -+- does not converge, so that Theorem 2 is 
of no help. It can be shown that 


(6*) R=\1/L, L = lim Wla,. 


nx 


This still does not help here, since ( V |dy,|) does not converge because V la,| = V1 /2% = 3 for odd n, whereas 
for even n we have 


Vian] = W2+ 1/2" > 1 as no, 
so that V/ |a,| has the two limit points 5 and 1. It can further be shown that 
(6**) R= 1/1, 7 the greatest limit point of the sequence \W lanl}. 


Here 7 = J, so that R = 1. Answer. The series converges for |z| < 1. ie 


Summary. Power series converge in an open circular disk or some even for every z (or 
some only at the center, but they are useless); for the radius of convergence, see (6) or 
Example 6. 

Except for the useless ones, power series have sums that are analytic functions (as we 
show in the next section); this accounts for their importance in complex analysis. 


PROBLEM SET 45-2 


; Z 24000, > 4+ 73/2 at = nn — 1 
1. Power series. Are 1/z + zt t andz + 2/" + 8. SG mi" 2S (n : Vig — en 
z” + z° + +++ power series? Explain. it n} oh. 3 
2. Radius of convergence. What is it? Its role? What = (2 — 2i)" 2/9; 
motivates its name? How can you find it? 10. > a 11. > (. ri ‘) Z 
i 
3. Convergence. What are the only basically different n=o n=0 
possibilities for the convergence of a power series? 2 (-1"n a aa 
12. me 13. 16"(z + i)” 
4. On Examples 1-3. Extend them to power series in = 8” : = oer 
powers of z — 4 + 377i. Extend Example | to the case = (-1)" = (On)! 
; = n)! 
of radius convergence 6. 14, > maee 15. S Fant Gai" 
5. Powers z~”. Show that if Sanz’ has radius of n=0 Me = ‘ 
convergence R (assumed finite), then Sanz” has = (3n)! es Qn ta 
5 * n pont 
radius of convergence VR. 16. x ns Zz 17. > nn + 1) 
n= n= 
6-18 | RADIUS OF CONVERGENCE 2 2-1)" 
: : 18. > a tt 
Find the center and the radius of convergence. : Var(2n + 1)n! 
2n in 
< rege 1 19. CAS PROJECT. Radius of C Writ 
6. A’: + 1)” 7, et F . Radius of Convergence. Write a 
py i ) py (2n)! (: 7) program for computing R from (6), (6*), or (6**), in 


n=0 n=0 


SEC. 15.3 Functions Given by Power Series 


20. 


this order, depending on the existence of the limits 
needed. Test the program on some series of your choice 
such that all three formulas (6), (6*), and (6**) will 
come up. 


TEAM PROJECT. Radius of Convergence. 


(a) Understanding (6). Formula (6) for R contains 
ldy/an+1|, not |a,+1/a,|. How could you memorize 
this by using a qualitative argument? 

(b) Change of coefficients. What happens to R 
(0<R< ©) if you (i) multiply all a, by k #0, 
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(ii) multiply all a, by k” # 0, (iii) replace a, by 
1/a,? Can you think of an application of this? 

(c) Understanding Example 6, which extends 
Theorem 2 to nonconvergent cases of dy/dy+1. 
Do you understand the principle of “mixing” by 
which Example 6 was obtained? Make up further 
examples. 

(d) Understanding (b) and (c) in Theorem 1. Does 
there exist a power series in powers of z that converges 
at z = 30 + 107 and diverges at z = 31 — 6i? Give 
reason. 


15.3 Functions Given by Power Series 


Here, our main goal is to show that power series represent analytic functions. This fact 
(Theorem 5) and the fact that power series behave nicely under addition, multiplication, 
differentiation, and integration accounts for their usefulness. 

To simplify the formulas in this section, we take zg = O and write 


d) 


foo} 
>» Anz”. 
n=0 


There is no loss of generality because a series in powers of Z— zo with any zo can always 
be reduced to the form (1) if we set Z — zo = z. 


Terminology and Notation. 


If any given power series (1) has a nonzero radius of 


convergence R (thus R > 0), its sum is a function of z, say f(z). Then we write 


(2) FQ = > anz™ = ag + az + age? + > 
n=0 


(Iz] < R). 


We say that f(z) is represented by the power series or that it is developed in the power 
series. For instance, the geometric series represents the function f(z) = 1/(1 — z) in the 
interior of the unit circle |z| = 1. (See Theorem 6 in Sec. 15.1.) 


Uniqueness of a Power Series Representation. 


This is our next goal. It means that a 


function f(z) cannot be represented by two different power series with the same center. 
We claim that if f(z) can at all be developed in a power series with center zo, the 
development is unique. This important fact is frequently used in complex analysis (as well 
as in calculus). We shall prove it in Theorem 2. The proof will follow from 


THEOREM 1 


Continuity of the Sum of a Power Series 


If a function f(z) can be represented by a power series (2) with radius of convergence 
R > 0, then f(z) is continuous at z = 0. 
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PROOF 


THEOREM 2 


PROOF 
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From (2) with z = 0 we have f(0) = dap. Hence by the definition of continuity we 
must show that lim, f(z) = f(0) = do. That is, we must show that for a given € > 0 
there is a 5 > 0 such that |z| < 6 implies | f(z) — ag| < €. Now (2) converges abso- 
lutely for |z] S r with any r such that 0 <r < R, by Theorem 1 in Sec. 15.2. Hence 
the series 


Co 1 foo} 
> ale = Tr > layla” 


converges. Let S # 0 be its sum. (S = 0 is trivial.) Then for 0 < zl Sr, 


[f@) — aol = 


oo 
> ae 
n=1 


co foo] 
= lz) > laallzl"-* = lel > lanir™* = lels 
n=1 n=1 


and |z|S <e when |z| <5, where & > 0 is less than r and less than e/S. Hence 
Iz|S << 6S < (€/S)S = e. This proves the theorem. |_| 


From this theorem we can now readily obtain the desired uniqueness theorem (again 
assuming Zg = O without loss of generality): 


Identity Theorem for Power Series. Uniqueness 


Let the power series dg + ayz + a +--+ and by + byz + byz +--+ both be 
convergent for |z| < R, where R is positive, and let them both have the same sum 
for all these z. Then the series are identical, that is, dg = bo, ay = by, dg = be,::: 

Hence if a function f(z) can be represented by a power series with any center Zo, 
this representation is unique. 


We proceed by induction. By assumption, 
ag + ayz + agz2 tes =bo thyzthzt-:: (lz| < R). 


The sums of these two power series are continuous at z = 0, by Theorem 1. Hence if we 
consider |z| > 0 and let z—>0 on both sides, we see that ag = bo: the assertion is true 
for n = 0. Now assume that a, = b, for n = 0, 1,---,m. Then on both sides we may 
omit the terms that are equal and divide the result by 2" (= 0); this gives 


2 2 
Am+1 + Am+22 + Om432° + °° = bm4i + bm42z + bm43z° +o. 
Similarly as before by letting z— 0 we conclude from this that aj,41 = by+1. This 


completes the proof. ia 


Operations on Power Series 


Interesting in itself, this discussion will serve as a preparation for our main goal, namely, 
to show that functions represented by power series are analytic. 
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THEOREM 3 


PROOF 


Termwise addition or subtraction of two power series with radii of convergence R, and 
Ry yields a power series with radius of convergence at least equal to the smaller of Ry, 
and Ry. Proof. Add (or subtract) the partial sums s,, and sj; term by term and use 
lim (s, + s*) = lim s,, + lim s*. 


Termwise multiplication of two power series 


FQ) = > az = ay + yz +> 
k=0 


and 


82) = >) bnz™ = bo t+ yz t+ 


m=0 


means the multiplication of each term of the first series by each term of the second series 
and the collection of like powers of z. This gives a power series, which is called the 
Cauchy product of the two series and is given by 


aobo + (dgby + aybo)z + (dg be + a,b, + aobo)z” +-:- 


= S (obn + aby-1 + +++ + an bo)z”. 
n=0 


We mention without proof that this power series converges absolutely for each z within 
the smaller circle of convergence of the two given series and has the sum s(z) = f(z)g(z). 
For a proof, see [D5] listed in App. 1. 


Termwise differentiation and integration of power series is permissible, as we show 
next. We call derived series of the power series (1) the power series obtained from (1) 
by termwise differentiation, that is, 


(3) > Nz” 1 = ay + 2agz + 3a3z7 + ++: 
n=1 


Termwise Differentiation of a Power Series 


The derived series of a power series has the same radius of convergence as the 
original series. 


This follows from (6) in Sec. 15.2 because 


ay, an 


n | | : n : 
lim 


im = lim = lim 
n> (n+ l)ldny,| nee n+ 1 n> 


Nn>n 


QAn+1 An+1 


or, if the limit does not exist, from (6**) in Sec. 15.2 by noting that Wn —>lasn—o, 
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THEOREM 4 


THEOREM 5 


PROOF 
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Application of Theorem 3 

Find the radius of convergence R of the following series by applying Theorem 3. 
ne 

Ss (sje = 27 + 3z9 + 624 + 1022 + °°. 

n=2 


Solution. Differentiate the geometric series twice term by term and multiply the result by Zz] 2. This yields 
the given series. Hence R = | by Theorem 3. 


Termwise Integration of Power Series 


The power series 


o 
an ay a2 
+1 2 3 
> f° Saget oe Fe 
n+1 2 3 
n=0 


obtained by integrating the series dg + ayz + doz” + +++ term by term has the same 
radius of convergence as the original series. 


The proof is similar to that of Theorem 3. 
With the help of Theorem 3, we establish the main result in this section. 


Power Series Represent Analytic Functions 


Analytic Functions. Their Derivatives 


A power series with a nonzero radius of convergence R represents an analytic 
function at every point interior to its circle of convergence. The derivatives of this 
function are obtained by differentiating the original series term by term. All the 
series thus obtained have the same radius of convergence as the original series. 
Hence, by the first statement, each of them represents an analytic function. 


(a) We consider any power series (1) with positive radius of convergence R. Let f(z) be 
its sum and /{(z) the sum of its derived series; thus 


(4) f@ = >; anz” and Ai@ = >) nayz”*. 


n=0 n=1 


We show that f(z) is analytic and has the derivative f;(z) in the interior of the circle of 
convergence. We do this by proving that for any fixed z with |z| < R and Az—0 the 
difference quotient [ f(z + Az) — f(z)]/Az approaches f;(z). By termwise addition we first 
have from (4) 


+ Az) - x + Az)” — 2% 
(5) f(z + Az) — f(@) es a, (+ Az —z ng} 
n=2 


Az Az 


Note that the summation starts with 2, since the constant term drops out in taking the 
difference f(z + Az) — f(z), and so does the linear term when we subtract f{(z) from the 
difference quotient. 
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(b) We claim that the series in (5) can be written 


(6) Dd a, Ade + Ag”? + 2ez + Ag 3 + +) + — 2)z" Fz + Ad 
n=2 
+ (n — 1)z2"77). 


The somewhat technical proof of this is given in App. 4. 


(c) We consider (6). The brackets contain n — | terms, and the largest coefficient is 
n — 1. Since (n — 1)? = n(n — 1), we see that for lz| = Ro and lz + Az| Ss Ro, Ro < R, 
the absolute value of this series (6) cannot exceed 


(7) [Az] > lanlan — )R57?. 


N=2 


This series with a, instead of |a,,| is the second derived series of (2) at z = Ro and 
converges absolutely by Theorem 3 of this section and Theorem | of Sec. 15.2. Hence 
our present series (7) converges. Let the sum of (7) (without the factor |Az|) be K (Ro). 
Since (6) is the right side of (5), our present result is 


f(z + Az) — fi) 
Az 


A@| = |AzlK(Ro). 


Letting Az—0 and noting that Ro (< R) is arbitrary, we conclude that f(z) is analytic at 
any point interior to the circle of convergence and its derivative is represented by the derived 
series. From this the statements about the higher derivatives follow by induction. ia 


Summary. The results in this section show that power series are about as nice as we 
could hope for: we can differentiate and integrate them term by term (Theorems 3 and 4). 
Theorem 5 accounts for the great importance of power series in complex analysis: the 
sum of such a series (with a positive radius of convergence) is an analytic function and 
has derivatives of all orders, which thus in turn are analytic functions. But this is only 
part of the story. In the next section we show that, conversely, every given analytic function 
f(z) can be represented by power series, called Taylor series and being the complex analog 
of the real Taylor series of calculus. 


PROBLEM SET 15-3 


1. 


Relation to Calculus. Material in this section gener- 
alizes calculus. Give details. 


. Termwise addition. Write out the details of the proof 


on termwise addition and subtraction of power series. 


. On Theorem 3. Prove that Vn—1 as n—>~, as 


5-15 


RADIUS OF CONVERGENCE 
BY DIFFERENTIATION OR INTEGRATION 


Find the radius of convergence in two ways: (a) directly by 
the Cauchy—Hadamard formula in Sec. 15.2, and (b) from a 
series of simpler terms by using Theorem 3 or Theorem 4. 


claimed. 2 - oe 2n+1 
= eS 1) 1)” 
. Cauchy product. Show that(1 — 27? = SS (n + 1)z” - Pars @- 2)" 6. Dr In +1\20 
n=0 
(a) by using the Cauchy product, (b) by differentiating “on 5 a 
te +2 
a suitable series. 7 2 gr - i) De ay ie + Ne 
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(—2)” 2n 
n(n + In + 2) * 


10. 


_- + 1) 
ete ces (z a 2)" 


>> 


2n(2n — 1) 


by —— 


13. s (" i P| ae 
i> 4 . ”) 2” 


eu _ De 


~2n-2 


15; Dare =i) 


16-20; APPLICATIONS 


OF THE IDENTITY THEOREM 


State clearly and explicitly where and how you are using 
Theorem 2. 


16. Even functions. If f(z) in (2) is even (ie., 
t(-z) = f(2), show that a, = 0 for odd n. Give 
examples. 


Power Series, Taylor Series 


17. 


18. 


19. 


20. 


Odd function. If f(z) in (2) is odd (1.e., f(—z) = —f(z)), 
show that a,, = 0 for even n. Give examples. 


Binomial coefficients. Using (1 + 2? + 2%= 


(1 + z)?*4 obtain the basic relation 


es sel 


n=0 


Find applications of Theorem 2 in differential equa- 
tions and elsewhere. 


TEAM PROJECT. Fibonacci numbers.” (a) The 
Fibonacci numbers are recursively defined by 
Ag =a, = 1, Qn41=&+ay-1 if n=1,2,:°- 
Find the limit of the sequence (ay, 44/dy). 

(b) Fibonacci’s rabbit problem. Compute a list of 
a1,°°*, 42. Show that dy. = 233 is the number 
of pairs of rabbits after 12 months if initially there 
is 1 pair and each pair generates 1 pair per month, 
beginning in the second month of existence (no deaths 
occurring). 


(c) Generating function. Show that the generating 
function of the Fibonacci numbers is /(z) = 
i/d-z- 2): that is, if a power series (1) represents 
this f(z), its coefficients must be the Fibonacci numbers 
and conversely. Hint. Start from f(z)(1 — z — 2?) = 
and use Theorem 2. 


15.4 Taylor and Maclaurin Series 


The Taylor series® of a function f(z), the complex analog of the real Taylor series is 


2) 


(1) 7@) = > a — 20) 


n=1 


or, by (1), Sec. 14.4, 


(2) Ay = 


1 
where In = f oa) 


ok 
: ; is i daa 
pat 


Gee — Fey) 


In (2) we integrate counterclockwise around a simple closed path C that contains zg in its 
interior and is such that f(z) is analytic in a domain containing C and every point inside C. 
A Maclaurin series? is a Taylor series with center zg = 0. 


2LEONARDO OF PISA, called FIBONACCI (= son of Bonaccio), about 1180-1250, Italian mathematician, 
credited with the first renaissance of mathematics on Christian soil. 

3BROOK TAYLOR (1685-1731), English mathematician who introduced real Taylor series. COLIN 
MACLAURIN (1698-1746), Scots mathematician, professor at Edinburgh. 
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The remainder of the Taylor series (1) after the term a,,(z — zg)” is 


= n+1 aa 
3) R,(2) = (2 — Zo) ; F(Z*) 


; ; me 
277i z* — zo) t1(z* — 2) 


(3 


(proof below). Writing out the corresponding partial sum of (1), we thus have 


ace Gea 
ie) 0 ee CO 


(i) 
+ = 


n! 


S(@) = fZo) + 
(4) 


Fo) + Rn. 


This is called Taylor’s formula with remainder. 


We see that Taylor series are power series. From the last section we know that power 
series represent analytic functions. And we now show that every analytic function can be 
represented by power series, namely, by Taylor series (with various centers). This makes 
Taylor series very important in complex analysis. Indeed, they are more fundamental in 
complex analysis than their real counterparts are in calculus. 


Taylor’s Theorem 


Let f(z) be analytic in a domain D, and let z = Zg be any point in D. Then there 
exists precisely one Taylor series (1) with center zo that represents f(z). This 
representation is valid in the largest open disk with center zg in which f(z) is analytic. 
The remainders R,(z) of (1) can be represented in the form (3). The coefficients 
satisfy the inequality 


M 
(5) la,| = 

- 
where M is the maximum of | f(z) on a circle |z — Zol = rin D whose interior is 
also in D. 


The key tool is Cauchy’s integral formula in Sec. 14.3; writing z and z* instead of zo and 
z (so that z* is the variable of integration), we have 


(6) i=. 


z lies inside C, for which we take a circle of radius r with center zg and interior in D 
(Fig. 367). We develop 1/(z* — z) in (6) in powers of z — zo. By a standard algebraic 
manipulation (worth remembering!) we first have 


1 1 1 


zz Z*— zy — (Z— Zp) Z—-Z0\_ 
o — ¢ 0) (ct za)(1 .) 


(7) 
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Fig. 367. Cauchy formula (6) 


For later use we note that since z* is on C while z is inside C, we have 


4. — £0 


(7*) < {i (Fig. 367). 


z* — Zo 
To (7) we now apply the sum formula for a finite geometric sum 


T= n+1 1 n+1 
(8*) Legg (q+ 0), 
I-q I-q l1-4@ 


which we use in the form (take the last term to the other side and interchange sides) 


n+1 


1 
(8) =) Sh pend ge 
1-q l-q 


Applying this with g = (z — zo)/(z* — Zo) to the right side of (7), we get 


1 1 pe es, Z—Z0 a ; Z— Zo \" 
ze—oz7 gk oz * ko» x 
0 z Z0 va ZO Zz Zo 


We insert this into (6). Powers of z — zg do not depend on the variable of integration z*, 
so that we may take them out from under the integral sign. This yields 


1 ok = ok 
poy pig Lohse Af PO ae 


A z* — Zo 277i (z* — Zg) 


277i mar Oe" FB) 


, & cof fZ*) 
* 
ln * — Zo) 
with R,,(z) given by (3). The integrals are those in (2) related to the derivatives, so that 
we have proved the Taylor formula (4). 
Since analytic functions have derivatives of all orders, we can take n in (4) as large as 
we please. If we let n approach infinity, we obtain (1). Clearly, (1) will converge and 
represent f(z) if and only if 


) lim Ry(2) = 0. 
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We prove (9) as follows. Since z* lies on C, whereas z lies inside C (Fig. 367), we have 
|c* — z| > 0. Since f(z) is analytic inside and on C, it is bounded, and so is the function 


F*)/@* — 2), say, 
f(2*) 


gee 


=M 


for all z* on C. Also, C has the radius r = |z* — Zol and the length 27rr. Hence by the 
ML-inequality (Sec. 14.1) we obtain from (3) 


ne Iz — zol"*? f FQ*) at 
= Z 
a 27 ae — zo) (c* = 2) 
10 
i Iz — zl”? 1 z-zo|"" 
= ar M vn 27r=M 


Now |z — Zol <r because z lies inside C. Thus |z — zol/r <1, so that the right side 
approaches 0 as n > ©. This proves that the Taylor series converges and has the sum f(z). 
Uniqueness follows from Theorem 2 in the last section. Finally, (5) follows from a, in 
(1) and the Cauchy inequality in Sec. 14.4. This proves Taylor’s theorem. i 


Accuracy of Approximation. We can achieve any preassinged accuracy in approxi- 
mating f(z) by a partial sum of (1) by choosing n large enough. This is the practical use 
of formula (9). 


Singularity, Radius of Convergence. On the circle of convergence of (1) there is at 
least one singular point of f(z), that is, a point z = c at which f(z) is not analytic 
(but such that every disk with center c contains points at which f(z) is analytic). We 
also say that f(z) is singular at c or has a singularity at c. Hence the radius of con- 
vergence R of (1) is usually equal to the distance from Zo to the nearest singular point 
of f(z). 

(Sometimes R can be greater than that distance: Ln z is singular on the negative real 
axis, whose distance from zo = —1 + is 1, but the Taylor series of Ln z with center 
Zo = —1 + ihas radius of convergence V2.) 


Power Series as Taylor Series 


Taylor series are power series—of course! Conversely, we have 


Relation to the Previous Section 


A power series with a nonzero radius of convergence is the Taylor series of its sum. 


Given the power series 


f(2) = ap + ay(z — Zo) + doz — Zo)” + a3(z — zo)? + °°. 


694 


EXAMPLE 1 


EXAMPLE 2 


CHAP. 15 Power Series, Taylor Series 


Then f(zo) = dg. By Theorem 5 in Sec. 15.3 we obtain 


f'@ = a, + 2ag(z — zo) + 3ax(z— zo)? +++:, thus f(z) = ay 


f'"@ = 2a + 3- AZ — Zz) +, thus (zo) = 2!as, 


and in general f (z9) = n!dy. With these coefficients the given series becomes the Taylor 
series of f(z) with center zo. | 


Comparison with Real Functions. One surprising property of complex analytic 
functions is that they have derivatives of all orders, and now we have discovered the other 
surprising property that they can always be represented by power series of the form (1). 
This is not true in general for real functions; there are real functions that have derivatives 
of all orders but cannot be represented by a power series. (Example: f(x) = exp (—1 /x?) 
if x # 0 and f(O) = 0; this function cannot be represented by a Maclaurin series in an 
open disk with center 0 because all its derivatives at 0 are zero.) 


Important Special Taylor Series 


These are as in calculus, with x replaced by complex z. Can you see why? (Answer. The 
coefficient formulas are the same.) 


Geometric Series 


Let f(z) = 1/(1 — 2). Then we have f(z) = n!/(. — z)"*4, f™(0) = n!. Hence the Maclaurin expansion of 
1/(1 — z) is the geometric series 


1 os 
(11) ae eect 2 pores (\zl = 1), 
ar 


n=0 
f(z) is singular at z = 1; this point lies on the circle of convergence. |_| 


Exponential Function 


We know that the exponential function e* (Sec. 13.5) is analytic for all z, and (e*)’ = e*. Hence from (1) with 
Zo = O we obtain the Maclaurin series 


oo wT 
Zz 

(12) ee ea 
nN. 


n=0 


This series is also obtained if we replace x in the familiar Maclaurin series of e*” by z. 
Furthermore, by setting z = iy in (12) and separating the series into the real and imaginary parts (see Theorem 
2, Sec. 15.1) we obtain 


Since the series on the right are the familiar Maclaurin series of the real functions cos y and sin y, this shows 
that we have rediscovered the Euler formula 


(13) e’ = cosy + isiny. 


Indeed, one may use (12) for defining e* and derive from (12) the basic properties of e*. For instance, the 
differentiation formula (e*)’ = e* follows readily from (12) by termwise differentiation. | 
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Trigonometric and Hyperbolic Functions 


By substituting (12) into (1) of Sec. 13.6 we obtain 


ea ~2n aes pes 
COS Z (-1)” 1 be aaa 
> (2n)! mM ai 
(14) 
a nel 3 3 
sinz = > (-1)” z | fees 
= (2n + 1)! aise! 


When z = x these are the familiar Maclaurin series of the real functions cos x and sin x. Similarly, by substituting 
(12) into (11), Sec. 13.6, we obtain 


= 20 2 4 
iB Gj Zz 
cosh z > 14 t Pots 
pan)! 2! 4! 
(15) oo «= pant 3B 5 
sinhz = > gt—+—te. o 
nay eae Ih See! 
Logarithm 
From (1) it follows that 
ee 
(16) In eae te (lzl < 0. 
Replacing z by —z and multiplying both sides by —1, we get 
I 2 63 
(17) Ln (1 — z) Ln z4 5 3 ee (Iz| < 1). 
By adding both series we obtain 
P+ z 2. 2 
(18) Las 2(24 5 - F --) (lzl<1). 
= 


Practical Methods 


The following examples show ways of obtaining Taylor series more quickly than by the 
use of the coefficient formulas. Regardless of the method used, the result will be the same. 
This follows from the uniqueness (see Theorem 1). 


Substitution 
Find the Maclaurin series of f(z) = 1/(1 + 2). 


Solution. By substituting —z” for z in (11) we obtain 


i e “ 
(19) YC" = J Ev" =1-24+24-2+-- (<p). 
“ n=0 


n=0 
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Integration 
Find the Maclaurin series of f(z) = arctan z. 


Solution. We have f(z) = 1/(. + 2°). Integrating (19) term by term and using f(0) = 0 we get 


~ (ie 2n+1 z 1 r4 ; 
arctan Zz x5 ag Zz 3 t 5 ress 
n 


(lzl < 1); 


n=0 
this series represents the principal value of w =u + iv = arctanz defined as that value for which 
|u| < 77/2. | 
Development by Using the Geometric Series 


Develop 1/(c — z) in powers of z — zo, where c — zo # 0. 


Solution. 
then the use of (11) with z replaced by (z — 


This was done in the proof of Theorem 1, where c = z*. The beginning was simple algebra and 


Zo)/(C — Zo): 
1 1 1 1 3 (==) 
C=2° C= 29> — 20) Z— Zo c— 20 Cc — Zo 
(¢'=Zo)| 1 = 
Cc Zo 


! jae = 20) | 
¢— 20 "C—Z C— 20 T * 


This series converges for 


lz — zol < le — zol. | 


<1, that is, 


C— Zo 
Binomial Series, Reduction by Partial Fractions 


Find the Taylor series of the following function with center zp = 1. 


227 + 92 + 5 
2+ 22 — 82-12 


f@ 


Solution. We develop f(z) in partial fractions and the first fraction in a binomial series 


1 e atgr=> "ye 


( Wi n=0 Me 
(20) 
mm+ 1), mm + In + 2) 5 
With a i 31 cca 


with m = 2 and the second fraction in a geometric series, and then add the two series term by term. This gives 


fe) 1, 2 1 2 a 1 ) 1 
“  @t22 2-3 B+t@-DP 2-G@-1 9\+3@- DP —3- 1 
he Ie 7 ‘y = (: = ‘y a r(-Dat 1) 1 m 
=o Zz 1 
Pal n 3 > 3 » | gn+2 Qn ae. 
8 31 b 23 b= i 275 bani 
9 54° 108 1944 


We see that the first series converges for |z — 1| < 3 and the second for |z — 1| < 2. This had to be expected 
because 1/(z + 2)? is singular at —2 and 2/(z — 3) at 3, and these points have distance 3 and 2, respectively, 
from the center z9 = 1. Hence the whole series converges for |z — 1] <2. | 


SEC. 15.4 Taylor and Maclaurin Series 


1. Calculus. Which of the series in this section have you 
discussed in calculus? What is new? 


2. On Examples 5 and 6. Give all the details in the 
derivation of the series in those examples. 


MACLAURIN SERIES 


Find the Maclaurin series and its radius of convergence. 


3. sin 2” * a= af 
LZ 
1 1 
"2424 “1 + 3iz 
7. cos” §z 8. sin? z 


@ ‘72 z 
9. | exp (+) dt 10. exp | exp (—1?) dt 
0 
HIGHER TRANSCENDENTAL 
FUNCTIONS 


Find the Maclaurin series by termwise integrating the 
integrand. (The integrals cannot be evaluated by the usual 
methods of calculus. They define the error function erf z, 
sine integral Si(z), and Fresnel integrals* S(z) and C(z), 
which occur in statistics, heat conduction, optics, and other 
applications. These are special so-called higher transcen- 
dental functions.) 


11-14 


z 
12. C(z) = | cos f2 dt 


z 
11. S(z) =| sin t? dt 
0 


0 
2 J * sin t 

13. erf z = —— | edt 14. Si(z) = | dt 
Vir Jy ‘ # 

15. CAS Project. sec, tan. (a) Euler numbers. The 


Maclaurin series 


21 Re ee 
(21) sec Z 0 ih rare free 


defines the Euler numbers Ezy. Show that Eo = 1, 
Eg = —1, Eq, = 5, Eg = —61. Write a program that 
computes the Ey, from the coefficient formula in (1) 
or extracts them as a list from the series. (For tables 
see Ref. [GenRef1], p. 810, listed in App. 1.) 


(b) Bernoulli numbers. The Maclaurin series 


3 4 


22 taf chet oe 
We) Fe a a gy 
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defines the Bernoulli numbers B,,. Using undetermined 
coefficients, show that 


At = JI: 
3, Bo =§, 


By =0, Be=a.""° 


B — 
(23) “+ : 
Ba = —30> 


Write a program for computing By. 
(c) Tangent. Using (1), (2), Sec. 13.6, and (22), show 
that tan z has the following Maclaurin series and 
calculate from it a table of Bo,---, Bao: 
2i 4i . 
(24) tanz = — i 
ew 1 e 


ane 2n _ 1) 


= . n-1 
2¢ " (2n)! 


16. Inverse sine. Developing 1/V 1 — 2” and integrating, 
show that 


. -()5 (35 
arcsin = T T 
Oe Ne) a ea pS 


7 
-(; 2 2): Evcoflel 245. 


2n-1 
Bon Z mu _ 


2°4:-6/7 


Show that this series represents the principal value of 
arcsin z (defined in Team Project 30, Sec. 13.7). 


17. TEAM PROJECT. Properties from Maclaurin 
Series. Clearly, from series we can compute function 
values. In this project we show that properties of 
functions can often be discovered from their Taylor or 
Maclaurin series. Using suitable series, prove the 
following. 


(a) The formulas for the derivatives of e*, cos z, sin z, 
cosh z, sinh z. and Ln (1 + z) 


(b) 5(e” + e*) = cosz 


(c) sinz # 0 for all pure imaginary z = iy # 0 


18-25| TAYLOR SERIES 


Find the Taylor series with center zg and its radius of 
convergence. 


18. 1/z, zo =i 

20. cos" z, zo = 17/2 

22. cosh (z — Ti), Zo9 = Ti 
23. (z+ i*, z=i 

25. sinh(2z— 1), Zo = i/2 


19. 1/1 — 2), 


21. sin z, 


Zo=l 
zo = 77/2 


24. ee?) Zo = iT 


4AUGUSTIN FRESNEL (1788-1827), French physicist and engineer, known for his work in optics. 
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15.5 Uniform Convergence. Optional 


DEFINITION 


EXAMPLE 1 


We know that power series are absolutely convergent (Sec. 15.2, Theorem 1) and, as 
another basic property, we now show that they are uniformly convergent. Since uniform 
convergence is of general importance, for instance, in connection with termwise integration 
of series, we shall discuss it quite thoroughly. 

To define uniform convergence, we consider a series whose terms are any complex 
functions f0(z), fi(z),° °° 


(1) Dd fn = fol) + fi@ + fl ++. 
m=0 


(This includes power series as a special case in which fj,(z) = am(z — Zo)'”.) We assume 
that the series (1) converges for all z in some region G. We call its sum s(z) and its nth 
partial sum s,,(z); thus 


Sn(Z) = foi) + AZ) +++ + fr(2)- 


Convergence in G means the following. If we pick a z = z, in G, then, by the definition 
of convergence at z1, for given € > 0 we can find an Nj(e) such that 


Is(z1) — Sn(z)| <€ for all n > Ny(€). 
If we pick a zg in G, keeping € as before, we can find an No(e) such that 
|s(z2) — Sn(z2)| < € for all n > No(e), 


and so on. Hence, given an € > 0, to each z in G there corresponds a number N,(e). This 
number tells us how many terms we need (what s,, we need) at a z to make |s(z) — sp(z)| 
smaller than €. Thus this number N,(€) measures the speed of convergence. 

Small N,(€) means rapid convergence, large N,(€) means slow convergence at the point 
z considered. Now, if we can find an N(e) larger than all these N,(e) for all z in G, we 
say that the convergence of the series (1) in G is uniform. Hence this basic concept is 
defined as follows. 


Uniform Convergence 


A series (1) with sum s(z) is called uniformly convergent in a region G if for every 
€ > 0 we can find an N = N(e), not depending on z, such that 


Is(z) — Sp(z)| <e for all n > N(e) and all z in G. 


Uniformity of convergence is thus a property that always refers to an infinite set in 
the z-plane, that is, a set consisting of infinitely many points. 


Geometric Series 


Show that the geometric series 1 + z + Zt is (a) uniformly convergent in any closed disk |z| Sr < 1, 
(b) not uniformly convergent in its whole disk of convergence |z| <1. 
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Solution. (a) For z in that closed disk we have |1 — z|=1-—r (sketch it). This implies that 
1/\1 — z| S 1/1 — nr). Hence (remember (8) in Sec. 15.4 with g = z 


+1 
zt 


2 


Mm 
z 


m=nt+1 


peti 


= 


|s(z) = Sn(2)| = 


l-r 


1=zZ 
Since r < 1, we can make the right side as small as we want by choosing n large enough, and since the right 


side does not depend on z (in the closed disk considered), this means that the convergence is uniform. 


(b) For given real K (no matter how large) and n we can always find a z in the disk lz] < 1 such that 


+1 
ze 


simply by taking z close enough to 1. Hence no single N(e) will suffice to make |s(z) — s,(z)| smaller than a 
given € > 0 throughout the whole disk. By definition, this shows that the convergence of the geometric series 
in |z| < 1 is not uniform. a 


This example suggests that for a power series, the uniformity of convergence may at most 
be disturbed near the circle of convergence. This is true: 


Uniform Convergence of Power Series 
A power series 


oo 


(2) Deze)” 


m=0 


with a nonzero radius of convergence R is uniformly convergent in every circular 
disk |z — Zol =r of radiusr < R. 


For |z — zo| S rand any positive integers n and p we have 

(3) lauedle i zo)"*? pees ap An + p(Z _ zo)" *? | = aa” ae Bae nag =. 
Now (2) converges absolutely if lz — Zol =r <R (by Theorem | in Sec. 15.2). Hence 
it follows from the Cauchy convergence principle (Sec. 15.1) that, an e > 0 being given, 
we can find an N(e) such that 


lang ret t tie + Wage Se forn > N(e) and p=1,2,---. 


From this and (3) we obtain 


anvia—zoy** +o + age = Ze)? | Se 
for all z in the disk |z — Zol =r, every n > N(e), and every p = 1, 2,---. Since N(e) is 
independent of z, this shows uniform convergence, and the theorem is proved. ia 


Thus we have established uniform convergence of power series, the basic concern of this 
section. We now shift from power series to arbitary series of variable terms and examine 
uniform convergence in this more general setting. This will give a deeper understanding 
of uniform convergence. 
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Properties of Uniformly Convergent Series 


Uniform convergence derives its main importance from two facts: 


1. If a series of continuous terms is uniformly convergent, its sum is also continuous 
(Theorem 2, below). 

2. Under the same assumptions, termwise integration is permissible (Theorem 3). 
This raises two questions: 

1. How can a converging series of continuous terms manage to have a discontinuous 
sum? (Example 2) 

2. How can something go wrong in termwise integration? (Example 3) 
Another natural question is: 

3. What is the relation between absolute convergence and uniform convergence? The 
surprising answer: none. (Example 5) 


These are the ideas we shall discuss. 


If we add finitely many continuous functions, we get a continuous function as their sum. 
Example 2 will show that this is no longer true for an infinite series, even if it converges 
absolutely. However, if it converges uniformly, this cannot happen, as follows. 


Continuity of the Sum 


Let the series 
> a@ =f +A@ ++ 
m=0 


be uniformly convergent in a region G. Let F(z) be its sum. Then if each term fy,(z) 
is continuous at a point z, in G, the function F(z) is continuous at 2}. 


Let s,,(z) be the nth partial sum of the series and R,,(z) the corresponding remainder: 
Sn =foth tro t+ fn» Ry = Inet * Inea POs 
Since the series converges uniformly, for a given e > 0 we can find an N = N(e) such that 


IRy(z)| < 5 forall a6. 


Since sy(z) is a sum of finitely many functions that are continuous at z 1, this sum is 
continuous at z1. Therefore, we can find a 6 > O such that 


Isy(z) — sy(zy)| < for all z in G for which |z — z,| < 6. 


Using F = sy + Ry and the triangle inequality (Sec. 13.2), for these z we thus obtain 


|F(z) — F(z1)| = |sw(z) + Rn(z) — [sn(z1) + Rv(zy)Il 


€ €,€ 
= Is) — sy(za)l + [Ry(2)I + [Rizal < 3 + 3 + 3° © 


This implies that F(z) is continuous at z,, and the theorem is proved. 8 
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Series of Continuous Terms with a Discontinuous Sum 


Consider the series 


2 2 2 
2g. o s a 


x* 4 t t bores (x real). 
Ltx2 (+x? (1 +223 


This is a geometric series with g = 1/(1 + x) times a factor x”. Its nth partial sum is 


2 1 1 1 
Sp(x) = x7] 14 + Fees 4 P 
P+x? (+3? (1 + x)" 


We now use the trick by which one finds the sum of a geometric series, namely, we multiply s,,(x) by 


=o S 1/0 x), 
ald) | : beer ' + : |: 
1+ x? dQ+x7* Gd +22)" 


1+ x? 


Adding this to the previous formula, simplifying on the left, and canceling most terms on the right, we obtain 


SnlX) x? ; : ‘ 
( + x2)rt1 


2 


1+ x? 
thus 


7 1 
(axl? 


S_(x) = 1 +x 


The exciting Fig. 368 “explains” what is going on. We see that if x # 0, the sum is 
s(x) = lim s(x) = 1 + x, 
no 


but for x = 0 we have s,,(0) = 1 — 1 = 0 forall n, hence s(0) = 0. So we have the surprising fact that the sum 
is discontinuous (at x = 0), although all the terms are continuous and the series converges even absolutely (its 
terms are nonnegative, thus equal to their absolute value!). 

Theorem 2 now tells us that the convergence cannot be uniform in an interval containing x = 0. We can also 
verify this directly. Indeed, for x # 0 the remainder has the absolute value 


1 
IRn(x)| = Is) — sn - 
(1 27)" 
and we see that for a given € (<1) we cannot find an N depending only on € such that |R,,| < € for all n > Me) 
and all x, say, in the intervalO SxS 1. Bi 
y 


21 0 1 « 
Fig. 368. Partial sums in Example 2 


Termwise Integration 


This is our second topic in connection with uniform convergence, and we begin with an 
example to become aware of the danger of just blindly integrating term-by-term. 
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EXAMPLE 3 


THEOREM 3 


PROOF 


CHAP. 15 Power Series, Taylor Series 


Series for Which Termwise Integration Is Not Permissible 


mx 


- 2 . . 
Let Um(x) = mxe and consider the series 


D fms) where fin®) = U(X) — U1) 
m=0 


in the interval 0 S x S 1. The nth partial sum is 


Syn = Uy — Ug + Ug — Uy tere + Uy — Un—1 = Un — Up = Un. 


Hence the series has the sum F(x) = tim Sy(xX) = tim Uy(x) =0 (0 Sx S 1). From this we obtain 


1 
| F(x) dx = 0. 
0 


On the other hand, by integrating term by term and using fy + fg +--: + fp = Sn, we have 


n i 


oo 1 1 
> | finlx) dx = im | fin(x) de = Tim | Sp (x) dx. 
=1 °0 0 0 


m m=1 


Now Sy, = Uy and the expression on the right becomes 


1 1 
1 
lim | u,(x) dx = lim { nxe7"™ dx = lim —(1 e”") . 
noe | noe | ne 9 2 


but not 0. This shows that the series under consideration cannot be integrated term by term from x = 0 to 


x=1. 


The series in Example 3 is not uniformly convergent in the interval of integration, and 
we shall now prove that in the case of a uniformly convergent series of continuous 


functions we may integrate term by term. 


Termwise Integration 
Let 


F@ = > fm = fol) + A@ +--- 


m=0 


any path in G. Then the series 


co 


(4) = | Sm(z) dz = | fo(z) dz + | file) dz +++ 


m=0 ~C Cc c 


is convergent and has the sum | F(z) dz. 
Cc 


be a uniformly convergent series of continuous functions in a region G. Let C be 


From Theorem 2 it follows that F(z) is continuous. Let s,,(z) be the nth partial sum of the 
given series and R,,(z) the corresponding remainder. Then F = s,, + R, and by integration, 


| F(zjdz= | Sn(z) dz + | R,(Z) dz. 


Cc C C 
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THEOREM 4 


THEOREM 5 


Let L be the length of C. Since the given series converges uniformly, for every given 
€ > 0 we can find a number N such that IR,,(z)| < e/L for all n > N and all z in G. By 
applying the ML-inequality (Sec. 14.1) we thus obtain 


<FL=e for alln > N. 


| Ry(Z) dz 


Cc 


Since R, = F — Sy, this means that 


| F(z) dz — | Sp(Zz) dz| <€ for alln > N. 
Cc C 
Hence, the series (4) converges and has the sum indicated in the theorem. ia] 


Theorems 2 and 3 characterize the two most important properties of uniformly convergent 
series. Also, since differentiation and integration are inverse processes, Theorem 3 implies 


Termwise Differentiation 


Let the series fo(z) + f(z) + fo(z) + +++ be convergent in a region G and let F(z) 
be its sum. Suppose that the series fo(z) + fi(z) + faz) + +++ converges uniformly 
in G and its terms are continuous in G. Then 


FQ =fo+fAi@+hR@ t+: for all z in G. 


Test for Uniform Convergence 


Uniform convergence is usually proved by the following comparison test. 


Weierstrass> M-Test for Uniform Convergence 


Consider a series of the form (1) in a region G of the z-plane. Suppose that one can 
find a convergent series of constant terms, 


(5) My + My, + Mo+-:-, 


such that | fy(z)| S Mm, for all z in G and every m = 0, 1,-++. Then (1) is uniformly 
convergent in G. 


The simple proof is left to the student (Team Project 18). 


°KARL WEIERSTRASS (1815-1897), great German mathematician, who developed complex analysis based 
on the concept of power series and residue integration. (See footnote in Section 13.4.) He put analysis on a 
sound theoretical footing. His mathematical rigor is so legendary that one speaks Weierstrassian rigor. (See 
paper by Birkhoff and Kreyszig, 1984 in footnote in Sec. 5.5; Kreyszig, E., On the Calculus, of Variations and 
Its Major Influences on the Mathematics of the First Half of Our Century. Part Il, American Mathematical 
Monthly (1994), 101, No. 9, pp. 902-908). Weierstrass also made contributions to the calculus of variations, 
approximation theory, and differential geometry. He obtained the concept of uniform convergence in 1841 
(published 1894, sic’); the first publication on the concept was by G. G. STOKES (see Sec 10.9) in 1847. 
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EXAMPLE 4 


EXAMPLE 5 


CHAP. 15 Power Series, Taylor Series 


Weierstrass M-Test 
Does the following series converge uniformly in the disk |z| S 1? 
bad ze +1 
moi m= + cosh mlz| , 


Solution. Uniform convergence follows by the Weierstrass M-test and the convergence of D1/m? (see 
Sec. 15.1, in the proof of Theorem 8) because 


| gti lz +1 
= 
m? + cosh mliz| m2 
ss r] 
m2 


No Relation Between Absolute 
and Uniform Convergence 


We finally show the surprising fact that there are series that converge absolutely but not 
uniformly, and others that converge uniformly but not absolutely, so that there is no relation 
between the two concepts. 


No Relation Between Absolute and Uniform Convergence 


The series in Example 2 converges absolutely but not uniformly, as we have shown. On the other hand, the series 


> ; ; ; rs bvee (x real) 


converges uniformly on the whole real line but not absolutely. 

Proof. By the familiar Leibniz test of calculus (see App. A3.3) the remainder R, does not exceed its first 
term in absolute value, since we have a series of alternating terms whose absolute values form a monotone 
decreasing sequence with limit zero. Hence given € > 0, for all x we have 


1 1 . 1 
IRn(x)| = —<e ifa>N@ =. 


3 a < 
x’ +n+tl on 


This proves uniform convergence, since N(€) does not depend on x. 
The convergence is not absolute because for any fixed x we have 


Aa 1 
x? +m x? +m 
k 
> = 
m 
where k is a suitable constant, and k21/m diverges. B 


PROBLEM SET 15-5 


1. CAS EXPERIMENT. Graphs of Partial Sums. (a) (b) Power series. Study the nonuniformity of con- 
Fig. 368. Produce this exciting figure using your CAS. vergence experimentally by graphing partial sums near 
Add further curves, say, those of s256, 51024, etc. on the the endpoints of the convergence interval for real 


Same screen. 


Zax. 


SEC. 15.5 Uniform Convergence. 


Optional 


2-9 


POWER SERIES 


Where does the power series converge uniformly? Give 
reason. 


3 (24) 
Tn —3 7 
az ta" 


wel _ i)” 
n! 


(z — i)” 


ae. (; ic + 2i)" 


. > 2"(tanh n?) 27” 


10-17 


UNIFORM CONVERGENCE 


Prove that the series converges uniformly in the indicated 
region. 


12. 


13. 


14. 


15. 


16. 


17. 


18. 


oc 
>» n® cosh n|z| , 


sin” |z| 


oo 
> 2 2 


all z 


TEAM PROJECT. Uniform Convergence. 
(a) Weierstrass M-test. Give a proof. 
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(b) Termwise differentiation. Derive Theorem 4 
from Theorem 3. 


(c) Subregions. Prove that uniform convergence of a 
series in a region G implies uniform convergence in 
any portion of G. Is the converse true? 


(d) Example 2. Find the precise region of convergence 
of the series in Example 2 with x replaced by a complex 
variable z. 


(e) Figure 369. Show that x7 3%, (1 + x2)7™ = 1 
if x # 0 and Oif x = 0. Verify by computation that the 
partial sums 51, 59, 53 look as shown in Fig. 369. 


Fig. 369. 


Sum s and partial 
sums in Team Project 18(e) 


HEAT EQUATION 


Show that (9) in Sec. 12.6 with coefficients (10) is a solution 
of the heat equation for t > 0, assuming that f(x) is 
continuous on the interval 0 S x = L and has one-sided 
derivatives at all interior points of that interval. Proceed as 
follows. 


19. 


20. 


Show that |B,,| is bounded, say |B,| < K for all n. 
Conclude that 


lunl < Ke7 nto if t=to>0 

and, by the Weierstrass test, the series (9) converges 
uniformly with respect tox andtfort 2 f9,0 Sx SL. 
Using Theorem 2, show that u(x, f) is continuous for 
t 2 fo and thus satisfies the boundary conditions (2) 
for t 2 fo. 


Show that |du,/dt| < dN Ke7 dito if t 2 fo and the 
series of the expressions on the right converges, by 
the ratio test. Conclude from this, the Weierstrass 
test, and Theorem 4 that the series (9) can be 
differentiated term by term with respect to ¢ and the 
resulting series has the sum du/dt. Show that (9) can 
be differentiated twice with respect to x and the 
resulting series has the sum d7u/dx”. Conclude from 
this and the result to Prob. 19 that (9) is a solution 
of the heat equation for all t = tg. (The proof that (9) 
satisfies the given initial condition can be found in 
Ref. [C10] listed in App. 1.) 
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1. What is convergence test for series? State two tests from 
memory. Give examples. 

2. What is a power series? Why are these series very 
important in complex analysis? 

3. What is absolute convergence? Conditional convergence? 
Uniform convergence? 

4. What do you know about convergence of power series? 

5. What is a Taylor series? Give some basic examples. 

6. What do you know about adding and multiplying power 
series? 

7. Does every function have a Taylor series development? 
Explain. 

8. Can properties of functions be discovered from 
Maclaurin series? Give examples. 

9. What do you know about termwise integration of 
series? 

10. How did we obtain Taylor’s formula from Cauchy’s 

formula? 


11-15 | RADIUS OF CONVERGENCE 


Find the radius of convergence. 


uu. Baer 
* Beare 
1) 
13. ea g=)" 
n=2 
14. De 30 
n=1 


SUMMARY—OF- CHAPTER 5 


Power Series, Taylor Series 


CHAP. 15 Power Series, Taylor Series 


CHAP TER—-15-REVIEW-QUESTIONS AND PROBLEMS 


16-20 | RADIUS OF CONVERGENCE 


Find the radius of convergence. Try to identify the sum of 
the series as a familiar function. 


0 Zz CJ gi 
16. > > 7. >, Ties 
n=1 n=0 
2n+1 
8. D Ga yl 7 
n=0 
0 ze CJ dd 
19. 20. S. ——— 
= (2n)! > (3 + 4)” 


21-25 | MACLAURIN SERIES 

Find the Maclaurin series and its radius of convergence. 
Show details. 
21. (sinh z?)/z? 
23. cos” z 


25. —(exp/(—2”) — 1)/z” 
26-30 | TAYLOR SERIES 


Find the Taylor series with the given point as center and its 
radius of convergence. 

26. 24, i 

27. cos z, 37 

28. 1/z,  2i 

29. Lnz, 3 

30. e*, Ti 


22. 1/1 - 28 
24. 1/(mz + 1) 


is of the form (Sec. 15.2) 


i} 


n=0 


Sequences, series, and convergence tests are discussed in Sec. 15.1. A power series 


(1) S) an(z — 20)" = a9 + az - 


Zo is its center. The series (1) converges for |z — zg| <R and diverges for 
lz — zol > R, where R is the radius of convergence. Some power series converge 


zo) + doz — zo) +°°°5 
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for all z (then we write R = ~). In exceptional cases a power series may converge 
only at the center; such a series is practically useless. Also, R = lim |a,/a,41| 
if this limit exists. The series (1) converges absolutely (Sec. 15.2) and uniformly 
(Sec. 15.5) in every closed disk lz — Zol =r<R(R > 0). It represents an analytic 
function f(z) for |z — zo| < R. The derivatives f'(z), f’(z),:++ are obtained by 
termwise differentiation of (1), and these series have the same radius of convergence 
R as (1). See Sec. 15.3. 

Conversely, every analytic function f(z) can be represented by power series. These 
Taylor series of f(z) are of the form (Sec. 15.4) 


— 1 
(2) fO= DAPI - zo)" (Iz — zo) < R), 
n=0 ~ 


as in calculus. They converge for all z in the open disk with center zg and radius 
generally equal to the distance from Zo to the nearest singularity of f(z) (point at 
which f(z) ceases to be analytic as defined in Sec. 15.4). If f(z) is entire (analytic 
for all z; see Sec. 13.5), then (2) converges for all z. The functions e*, cos z, sin z, 
etc. have Maclaurin series, that is, Taylor series with center 0, similar to those in 
calculus (Sec. 15.4). 


CHAPTER | 6 


Laurent Series. 
Residue Integration 


The main purpose of this chapter is to learn about another powerful method for evaluating 
complex integrals and certain real integrals. It is called residue integration. Recall that 
the first method of evaluating complex integrals consisted of directly applying Cauchy’s 
integral formula of Sec. 14.3. Then we learned about Taylor series (Chap. 15) and will 
now generalize Taylor series. The beauty of residue integration, the second method of 
integration, is that it brings together a lot of the previous material. 

Laurent series generalize Taylor series. Indeed, whereas a Taylor series has positive 
integer powers (and a constant term) and converges in a disk, a Laurent series (Sec. 16.1) 
is a series of positive and negative integer powers of z — Zo and converges in an annulus 
(a circular ring) with center zo. Hence, by a Laurent series, we can represent a given 
function f(z) that is analytic in an annulus and may have singularities outside the ring as 
well as in the “hole” of the annulus. 

We know that for a given function the Taylor series with a given center zo is unique. 
We shall see that, in contrast, a function f(z) can have several Laurent series with the 
same center Zg and valid in several concentric annuli. The most important of these series 
is the one that converges for 0 < lz = Zol < R, that is, everywhere near the center zg 
except at zg itself, where zo is a singular point of f(z). The series (or finite sum) of the 
negative powers of this Laurent series is called the principal part of the singularity of 
f(z) at Zo, and is used to classify this singularity (Sec. 16.2). The coefficient of the power 
1/(z — Zo) of this series is called the residue of f(z) at zo. Residues are used in an elegant 
and powerful integration method, called residue integration, for complex contour integrals 
(Sec. 16.3) as well as for certain complicated real integrals (Sec. 16.4). 


Prerequisite: Chaps. 13, 14, Sec. 15.2. 
Sections that may be omitted in a shorter course: 16.2, 16.4. 
References and Answers to Problems: App. | Part D, App. 2. 


16.1 Laurent Series 
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Laurent series generalize Taylor series. If, in an application, we want to develop a function 
F(Z in powers of z — Zp when f(z) is singular at zp (as defined in Sec. 15.4), we cannot 
use a Taylor series. Instead we can use a new kind of series, called Laurent series, | 


1PIERRE ALPHONSE LAURENT (1813-1854), French military engineer and mathematician, published the 
theorem in 1843. 
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THEOREM 1 


consisting of positive integer powers of z — zg (and a constant) as well as negative integer 
powers of z — Zo; this is the new feature. 

Laurent series are also used for classifying singularities (Sec. 16.2) and in a powerful 
integration method (“residue integration,” Sec. 16.3). 

A Laurent series of f(z) converges in an annulus (in the “hole” of which f(z) may have 
singularities), as follows. 


Laurent’s Theorem 


Let f(z) be analytic in a domain containing two concentric circles Cy and Cz with 
center Zg and the annulus between them (blue in Fig. 370). Then f(z) can be 
represented by the Laurent series 


f@ = Dane =o) a Sa 


ae oP 
(1) = ay + ay(z — 20) + aa(z — zo)" + °° 
by bo 


_+ + 3 
= eG) (Z — Zo) 


consisting of nonnegative and negative powers. The coefficients of this Laurent series 
are given by the integrals 


ok 
(2) aq,= : ; IG a dz*, by = ane ; (z* — zo)” 1 f(z*) dz*, 


*k 
ami J, & Zo) 27 J, 


taken counterclockwise around any simple closed path C that lies in the annulus 
and encircles the inner circle, as in Fig. 370. [The variable of integration is denoted 
by z* since z is used in (1).] 

This series converges and represents f(z) in the enlarged open annulus obtained 
from the given annulus by continuously increasing the outer circle C, and decreasing 
Cz until each of the two circles reaches a point where f(z) is singular. 

In the important special case that zg is the only singular point of f(z) inside Co, 
this circle can be shrunk to the point zo, giving convergence in a disk except at the 
center. In this case the series (or finite sum) of the negative powers of (1) is called 
the principal part of f(z) at zo [or of that Laurent series (1)]. 


CQ, 


Cy 


Fig. 370. Laurent’s theorem 
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PROOF 


CHAP. 16 Laurent Series. Residue Integration 


COMMENT. Obviously, instead of (1), (2) we may write (denoting b, by a_n) 


(1’) JOS OG any 


N=—% 


where all the coefficients are now given by a single integral formula, namely, 


(2’) an — 


* 
: ; nee dz* (a = 0, +1, =2,-++). 


$ n+1 
277i e (> = ay) 
Let us now prove Laurent’s theorem. 


(a) The nonnegative powers are those of a Taylor series. 

To see this, we use Cauchy’s integral formula (3) in Sec. 14.3 with z* (instead of z) as 
the variable of integration and z instead of zo. Let g(z) and A(z) denote the functions 
represented by the two terms in (3), Sec. 14.3. Then 


* 
(3) f@ = e@) +h = ee ; END ae 


i ‘i 277i ita 
c, 2 Zz c, = Zz 


Here z is any point in the given annulus and we integrate counterclockwise over both C; 
and Cy, so that the minus sign appears since in (3) of Sec. 14.3 the integration over Cg 
is taken clockwise. We transform each of these two integrals as in Sec. 15.4. The first 
integral is precisely as in Sec. 15.4. Hence we get exactly the same result, namely, the 
Taylor series of g(z), 


dz* = > anl(z — Z0)” 
n=0 


I FR") = 
a) w= so 


Ti Russ 
c,* Zz 


with coefficients [see (2), Sec. 15.4, counterclockwise integration] 


ok 
(5) a | FEY ey 
aa 


7 . +1 
Ti z* — zo)” 


Here we can replace C by C (see Fig. 370), by the principle of deformation of path, since 
Zo, the point where the integrand in (5) is not analytic, is not a point of the annulus. This 
proves the formula for the a,, in (2). 


(b) The negative powers in (1) and the formula for b,, in (2) are obtained if we consider 
h(z). It consists of the second integral times —1/(27ri) in (3). Since z lies in the annulus, 
it lies in the exterior of the path Cy. Hence the situation differs from that for the first 
integral. The essential point is that instead of [see (7*) in Sec. 15.4] 

Z— Zo z* — Zo 


(6) (a) ZA; 


<1 we now have (b) 


z* — zq £— £0 
Consequently, we must develop the expression 1/(z* — z) in the integrand of the second 
integral in (3) in powers of (z* — zo)/(z — Zo) (instead of the reciprocal of this) to get a 
convergent series. We find 


SEC. 16.1 


Laurent Series 711 


1 | 1 -1 


ze—z z* — zg — (z — Zo) i z*—7z\ 
(z — Zo) f= aa 


Compare this for a moment with (7) in Sec. 15.4, to really understand the difference. Then 
go on and apply formula (8), Sec. 15.4, for a finite geometric sum, obtaining 


> 
1 1 z* — Zo 2° 26 z*— zg\" 
z 14 + feet 
Le Z— Zo Z— Zo Z— Zo Z— Zo 
nt+1 


1 z* — Zo 
ZS ZN 2 = Zo 


Multiplication by —f(z*)/277i and integration over Cz on both sides now yield 


A(z) = 


ores 1 7 
Qari {; — Zo f slen dz* + pao f (z* — zo) f(z*) dz* + 


—— 4 (2 — zo)"""f@*) dz* 
(Z — Zo) Jo, 
1 


(z _ zur 


; (* = Zo) F(*) ace} + REZ) 
Ce 


with the last term on the right given by 


1 ; ese) ali 
2ai(z — zo)"** 


(7) Ri = f(2*) dz*®. 


Gg 2-2 


As before, we can integrate over C instead of C2 in the integrals on the right. We see that 
on the right, the power 1/(z — zo)” is multiplied by b, as given in (2). This establishes 
Laurent’s theorem, provided 


(8) lim Rp(z) = 0. 


(c) Convergence proof of (8). Very often (1) will have only finitely many negative powers. 
Then there is nothing to be proved. Otherwise, we begin by noting that f(z*)/(z — z*) in (7) 
is bounded in absolute value, say, 


ie) 


ZZ 


<M for all z* on Co 


because f(z*) is analytic in the annulus and on Co, and z* lies on Cy and z outside, so 
that z — z* # 0. From this and the ML-inequality (Sec. 14.1) applied to (7) we get the 
inequality (L = 27rs = length of Cy, ro = |z* — zo| = radius of Cz = const) 
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EXAMPLE 1 


EXAMPLE 2 


CHAP. 16 Laurent Series. Residue Integration 


. 1 a ML n+1 
[Ri(z)| S$ ——————— r 3) ML = (2 ) 
lz = zo 


From (6b) we see that the expression on the right approaches zero as n approaches infinity. 
This proves (8). The representation (1) with coefficients (2) is now established in the given 
annulus. 


(d) Convergence of (1) in the enlarged annulus. The first series in (1) is a Taylor 
series [representing g(z)]; hence it converges in the disk D with center zg whose radius 
equals the distance of the singularity (or singularities) closest to zg. Also, g(z) must be 
singular at all points outside Cy where f(z) is singular. 

The second series in (1), representing /(z), is a power series in Z = 1/(z — Zo). Let the 
given annulus be ra < |z — zo| < ry, where ry and rg are the radii of C, and Cy, respectively 
(Fig. 370). This corresponds to 1/rg > |Z| > 1/ry. Hence this power series in Z must 
converge at least in the disk IZ} <1 /rz. This corresponds to the exterior Iz —- Zol > re of 
Cy, so that h(z) is analytic for all z outside Cy. Also, h(z) must be singular inside Cy 
where f(z) is singular, and the series of the negative powers of (1) converges for all z 
in the exterior E of the circle with center zo and radius equal to the maximum distance 
from Zo to the singularities of f(z) inside Cy. The domain common to D and E is the 
enlarged open annulus characterized near the end of Laurent’s theorem, whose proof 
is now complete. |_| 


Uniqueness. The Laurent series of a given analytic function f(z) in its annulus of 
convergence is unique (see Team Project 18). However, f(z) may have different Laurent 
series in two annuli with the same center; see the examples below. The uniqueness is 
essential. As for a Taylor series, to obtain the coefficients of Laurent series, we do not 
generally use the integral formulas (2); instead, we use various other methods, some of 
which we shall illustrate in our examples. If a Laurent series has been found by any such 
process, the uniqueness guarantees that it must be the Laurent series of the given function 
in the given annulus. 


Use of Maclaurin Series 
Find the Laurent series of z~° sin z with center 0. 


Solution. By (14), Sec. 15.4, we obtain 


= 
= 


z > sin z S <p" gen=4 : : 2 4 tee (Iz| > 0) 
. ; (2n + 1)! zt 62 120-5040 : 


n=0 


Here the “annulus” of convergence is the whole complex plane without the origin and the principal part of the 
series at 0 is z~* — 7 aa 


Substitution 


Find the Laurent series of z2e/* with center 0. 


Solution. From (12) in Sec. 15.4 with z replaced by 1/z we obtain a Laurent series whose principal part is 
an infinite series, 


; 1 1 ti 1 
elz a(i4 ---) 2+zt+—4 te (|| > 0). 


SEC. 16.1 Laurent Series 


EXAMPLE 3 


EXAMPLE 4 


EXAMPLE 5 


Development of 1/(1 — z) 


Develop 1/(1 — z) (a) in nonnegative powers of z, (b) in negative powers of z. 


Solution. 


(a) = > 2" 


(b) = n+1 


N 
N 
N 


Laurent Expansions in Different Concentric Annuli 
Find all Laurent series of 1/(z? — z*) with center 0. 


Solution. Multiplying by 1/ 23, we get from Example 3 


1 = 1 1 1 


5) so = DMO Kat atc tit: 
| aay 4 n=0 Zz Zz z 

1 7 1 1 1 

3_ 4 >> nt+4 4 5 
Zz z n=0 Z Zz z 
Use of Partial Fractions 
—2z + 3 
Find all Taylor and Laurent series of f(z) = oS, ah with center 0. 
go> 3242 


Solution. 1n terms of partial fractions, 


1 1 =. 1 
(c) z-2 1 > grt a" 
- a(t = 2) n=0 
2 
1 1 = 28 
(d) = A 2a 


(1) From (a) and (c), valid for |z| < 1 (see Fig. 371), 


~ n 1 n 3,5 I 2 
fe) > (1 an): tet ge 


n=0 


Fig. 371. Regions of convergence in Example 5 
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(valid if |z| < 1). 


(validif|z|] > 1). 


<2 <1, 


(zl >). 


(Izl < 2), 


(\zl > 2). 
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(IL) From (c) and (b), valid for 1 < |z| < 2, 


CHAP. 16 Laurent Series. Residue Integration 


f@) 


(III) From (d) and (b), valid for |z| > 2, 


Be SU) te ee ll Lt ».4'% 1 1 
z F ae <n ia al 
= gnti > gett 2 4 8 Z 2 
2 2 3 5 9 
fO=--LDO+) aa=-l-a- Bos al 


n=0 


If f(z) in Laurent’s theorem is analytic inside Cy, the coefficients b,, in (2) are zero by 
Cauchy’s integral theorem, so that the Laurent series reduces to a Taylor series. Examples 


3(a) and 5(J) illustrate this. 


PROBLEEM—SET 16-1 


1-8 | LAURENT SERIES NEAR A SINGULARITY 


ATO 
Expand the function in a Laurent series that converges for 
0 < |z| < R and determine the precise region of conver- 
gence. Show the details of your work. 


‘ COS Z 5 exp (-1/z?) 
5 A i 2 
exp 2 sin 77z 
3; 3 4. 5 
Zz Zz 
1 sinh 2z 
5. = 3 6. ri 
: 4 Zz 
1 = 
ds 2? cosh = 8. 
ae 2 _ 2 


9-16 | LAURENT SERIES NEAR A SINGULARITY 


AT Zo 


Find the Laurent series that converges for0 < |z — zol <R 
and determine the precise region of convergence. Show details. 


z 2 ‘ 
=3 
9. ——, w=1 10. ~~, x =3 
(z= 1) (z — 3) 
2 
Zz 1 
11. ry Zo = Ti 12. a et Zo=1 
(z — Tri) ZZ >a) 
1 az 
13.=— , w=i MW, w=b 
2 (2 — i z—b 
COS Z 
i? 2 Z0 >= 7 
ZT 
sin 
16. ——, w=he 
(< — 477) 


17. CAS PROJECT. Partial Fractions. Write a program 
for obtaining Laurent series by the use of partial 
fractions. Using the program, verify the calculations in 
Example 5 of the text. Apply the program to two other 
functions of your choice. 

TEAM PROJECT. Laurent Series. (a) Uniqueness. 
Prove that the Laurent expansion of a given analytic 
function in a given annulus is unique. 


18. 


(b) Accumulation of singularities. Does tan (1/z) 
have a Laurent series that converges in a region 
0 < |z| < R? (Give a reason.) 
(c) Integrals. Expand the following functions in a 
Laurent series that converges for |z| > 0: 

1 | 1 1 | 


0 0 


ae * sin t 
| at 


= dt, 5 


19-25) TAYLOR AND LAURENT SERIES 


Find all Taylor and Laurent series with center zo. Determine 
the precise regions of convergence. Show details. 


1 


19. 
1-2 


> Zo = 0 


sin z 


21. 


, 
ztaq 


~. 


1 
22. >, Zo 


N 


24, ——— 


25. 
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16.2 Singularities and Zeros. Infinity 


EXAMPLE 1 


Roughly, a singular point of an analytic function f(z) is a zg at which f(z) ceases to be 
analytic, and a zero is a z at which f(z) = 0. Precise definitions follow below. In this 
section we show that Laurent series can be used for classifying singularities and Taylor 
series for discussing zeros. 

Singularities were defined in Sec. 15.4, as we shall now recall and extend. We also 
remember that, by definition, a function is a single-valued relation, as was emphasized 
in Sec. 13.3. 

We say that a function f(z) is singular or has a singularity at a point z = Zo if f(z) is not 
analytic (perhaps not even defined) at z = zo, but every neighborhood of z = zg contains 
points at which f(z) is analytic. We also say that z = zo is a singular point of f(z). 

We call z = Zo an isolated singularity of f(z) if z = zo has a neighborhood without 
further singularities of f(z). Example: tan z has isolated singularities at +7r/2, +377/2, etc.; 
tan (1/z) has a nonisolated singularity at 0. (Explain!) 

Isolated singularities of f(z) at z = zo can be classified by the Laurent series 

co oo by 
(1) f2 = > an — zo)" + } —— (Sec. 16.1) 


n=0 n=1 & a Zo)" 


valid in the immediate neighborhood of the singular point z = Zo, except at Zg itself, that 
is, in a region of the form 


0< |z=29|< 8 


The sum of the first series is analytic at z = zg, as we know from the last section. The 
second series, containing the negative powers, is called the principal part of (1), as we 
remember from the last section. If it has only finitely many terms, it is of the form 


by bin 


+ inte + ————__ 
Z— 29 (z — Zo)" 


(2) (bm # 0). 


Then the singularity of f(z) at z = Zo is called a pole, and m is called its order. Poles of 
the first order are also known as simple poles. 

If the principal part of (1) has infinitely many terms, we say that f(z) has at z = zp an 
isolated essential singularity. 

We leave aside nonisolated singularities. 


Poles. Essential Singularities 


The function 


ee 
” az— 2)? (¢- 2) 


has a simple pole at z = 0 and a pole of fifth order at z = 2. Examples of functions having an isolated essential 
singularity at z = 0 are 


ra | 
1 1 ! n 
ea Yop te sgt 
n=0'5"~ ee 
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EXAMPLE 2 


THEOREM 1 


EXAMPLE 3 


THEOREM 2 


CHAP. 16 Laurent Series. Residue Integration 


and 


= (-1)" 1 oo1 1 
sin 
z = (Qn + 1yiz™*>  z  3iz8 


5 sin z has a fourth-order 


Section 16.1 provides further examples. In that section, Example | shows that z~ 
pole at 0. Furthermore, Example 4 shows that 1/ (c3 — 24) has a third-order pole at 0 and a Laurent series with 
infinitely many negative powers. This is no contradiction, since this series is valid for |z| > 1; it merely tells 
us that in classifying singularities it is quite important to consider the Laurent series valid in the immediate 


neighborhood of a singular point. In Example 4 this is the series (I), which has three negative powers. a] 


The classification of singularities into poles and essential singularities is not merely a formal 
matter, because the behavior of an analytic function in a neighborhood of an essential 
singularity is entirely different from that in the neighborhood of a pole. 


Behavior Near a Pole 


f@ = 1/2” has a pole at z = 0, and If@| — © as z — 0 in any manner. This illustrates the following 
theorem. ea 


Poles 


If f(z) is analytic and has a pole at z = Zo, then | f(z)| > % as z— Zo in any manner. 


The proof is left as an exercise (see Prob. 24). 


Behavior Near an Essential Singularity 


The function f(z) = e!/* has an essential singularity at z = 0. It has no limit for approach along the imaginary 
axis; it becomes infinite if z — 0 through positive real values, but it approaches zero if z — 0 through negative real 
values. It takes on any given value c = coe’ # 0 in an arbitrarily small e-neighborhood of z = 0. To see the 
latter, we set z = re”, and then obtain the following complex equation for r and 6, which we must solve: 


ellz = efos 6-7 sin 6)/r = coe”. 
Equating the absolute values and the arguments, we have e°©°S /” = co, that is 
cos 6 = rlnco, and —sin 6 = ar 


respectively. From these two equations and cos” 6 + sin? @ = r2(Inco)* + ar? = 1 we obtain the formulas 


a 


1 
ae ia cto and tan @ = — : 
(In co)* + @& In co 


Hence r can be made arbitrarily small by adding multiples of 277 to a, leaving c unaltered. This illustrates the 
very famous Picard’s theorem (with z = 0 as the exceptional value). ia] 


Picard’s Theorem 


If f(z) is analytic and has an isolated essential singularity at a point Z9, it takes on 
every value, with at most one exceptional value, in an arbitrarily small €-neighborhood 


of Zo- 


For the rather complicated proof, see Ref. [D4], vol. 2, p. 258. For historical information 
on Picard, see footnote 9 in Problem Set 1.7. 
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EXAMPLE 4 


THEOREM 3 


PROOF 


THEOREM 4 


Removable Singularities. We say that a function f(z) has a removable singularity at 
Z = Zo if f(z) is not analytic at z = zo, but can be made analytic there by assigning a 
suitable value f(zo). Such singularities are of no interest since they can be removed as 
just indicated. Example: f(z) = (sin z)/z becomes analytic at z = 0 if we define f(0) = 1. 


Zeros of Analytic Functions 


A zero of an analytic function f(z) in a domain D is a z = Zg in D such that f(zo) = 0. 
A zero has order n if not only f but also the derivatives f’, f”,-+-,f ™—D are all 0 at z = zo 
but f oO) # 0. A first-order zero is also called a simple zero. For a second-order zero, 
f(Zo) = f'(Zo) = 0 but f"(zo) # 0. And so on. 


Zeros 


The function 1 + z? has simple zeros at +i. The function (1 — zy has second-order zeros at +1 and +i. The 
function (z — a> has a third-order zero at z = a. The function e* has no zeros (see Sec. 13 5). The function sin z 
has simple zeros at 0, +77, +277,---, and sin? z has second-order zeros at these points. The function 1 — cos z has 
second-order zeros at 0, +277, +477,---, and the function (1 — cos 2 has fourth-order zeros at these points. ie] 


Taylor Series at a Zero. At an nth-order zero z = Zo of f(z), the derivatives f’(zg),---, 
f camer are zero, by definition. Hence the first few coefficients dg, «++ , a,—1 of the Taylor 
series (1), Sec. 15.4, are zero, too, whereas a,, # 0, so that this series takes the form 


(3) F@ = ane — 20)" + ansi@ — zo)? + 
= (% — Zo)" [dn + dnsiZ — Zo) + Gns2aZ — Zo)” + +7] Gn # 0): 
This is characteristic of such a zero, because, if f(z) has such a Taylor series, it has an 


nth-order zero at z = Zo, as follows by differentiation. 
Whereas nonisolated singularities may occur, for zeros we have 


Zeros 


The zeros of an analytic function f(z) (# 0) are isolated; that is, each of them has 
a neighborhood that contains no further zeros of f(z). 


The factor (z — zo)” in (3) is zero only at z = zg. The power series in the brackets [--- ] 
represents an analytic function (by Theorem 5 in Sec. 15.3), call it g(z). Now 
8(Zo) = ay # 0, since an analytic function is continuous, and because of this continuity, 
also g(z) # 0 in some neighborhood of z = zg. Hence the same holds of f(z). | 


This theorem is illustrated by the functions in Example 4. 

Poles are often caused by zeros in the denominator. (Example: tan z has poles where 
cos zis zero.) This is a major reason for the importance of zeros. The key to the connection 
is the following theorem, whose proof follows from (3) (see Team Project 12). 


Poles and Zeros 


Let f(z) be analytic at z = zo and have a zero of nth order at z = Zo. Then 1/f(z) 
has a pole of nth order at z = zo; and so does h(z)/f(z), provided h(z) is analytic 
at Z = Zg and h(zg) # 0. 
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EXAMPLE 5 
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Fig. 372. Riemann sphere 


Riemann Sphere. Point at Infinity 


When we want to study complex functions for large |z|, the complex plane will generally 
become rather inconvenient. Then it may be better to use a representation of complex numbers 
on the so-called Riemann sphere. This is a sphere S of diameter 1 touching the complex 
z-plane at z = 0 (Fig. 372), and we let the image of a point P (a number z in the plane) be 
the intersection P* of the segment PN with S, where N is the “North Pole” diametrically 
opposite to the origin in the plane. Then to each z there corresponds a point on S. 

Conversely, each point on S represents a complex number z, except for N, which does 
not correspond to any point in the complex plane. This suggests that we introduce an 
additional point, called the point at infinity and denoted ~ (“infinity”) and let its image 
be N. The complex plane together with © is called the extended complex plane. The 
complex plane is often called the finite complex plane, for distinction, or simply the 
complex plane as before. The sphere S is called the Riemann sphere. The mapping of 
the extended complex plane onto the sphere is known as a stereographic projection. 
(What is the image of the Northern Hemisphere? Of the Western Hemisphere? Of a straight 
line through the origin?) 


Analytic or Singular at Infinity 


If we want to investigate a function f(z) for large |z|, we may now set z = 1/w and investigate 
f(z) = fA/w) = gw) ina neighborhood of w = 0. We define f(z) to be analytic or singular 
at infinity if g(w) is analytic or singular, respectively, at w = 0. We also define 


(4) g(0) = lim gw) 


if this limit exists. 
Furthermore, we say that f(z) has an nth-order zero at infinity if f(1/w) has such a zero 
at w = 0. Similarly for poles and essential singularities. 


Functions Analytic or Singular at Infinity. Entire and Meromorphic Functions 


The function f(z) = 1/22 is analytic at © since g(w) = f(1/w) = w? is analytic at w = 0, and f(z) has a second- 


order zero at ©. The function f(z) = 23 is singular at © and has a third-order pole there since the function 
g(w) = f(1/w) = 1/ w® has such a pole at w = 0. The function e* has an essential singularity at 2 since ew 
has such a singularity at w = 0. Similarly, cos z and sin z have an essential singularity at 7. 

Recall that an entire function is one that is analytic everywhere in the (finite) complex plane. Liouville’s 
theorem (Sec. 14.4) tells us that the only bounded entire functions are the constants, hence any nonconstant 
entire function must be unbounded. Hence it has a singularity at ©, a pole if it is a polynomial or an essential 
singularity if it is not. The functions just considered are typical in this respect. 
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An analytic function whose only singularities in the finite plane are poles is called a meromorphic function. 
Examples are rational functions with nonconstant denominator, tan z, cot z, sec z, and csc z. =| 


In this section we used Laurent series for investigating singularities. In the next section 
we shall use these series for an elegant integration method. 


PROBLEEM—SET 16-2 


1-10 | ZEROS 
Determine the location and order of the zeros. 
1. sin* $z 2. (z* — 81)8 
3. (c + 8114 4. tan? 2z 
5. 27 sin? az 6. cosh* z 
7. 24+ (1 - 822 - 81 
8. (sinz — 1)3 
9. sin 2z cos 2z 


» (2? = 8)°(exp (2) — 1) 
. Zeros. If f(z) is analytic and has a zero of order n at 


Z = Zo, show that f ?(z) has a zero of order 2n at Zo. 


. TEAM PROJECT. Zeros. (a) Derivative. Show that 


if f(z) has a zero of order n > 1 at z = Zo, then f "(z) 
has a zero of order n — | at Zo. 

(b) Poles and zeros. Prove Theorem 4. 

(c) Isolated k-points. Show that the points at which 
a nonconstant analytic function f(z) has a given value 
k are isolated. 

(d) Identical functions. If f;(z) and f(z) are analytic 
in a domain D and equal at a sequence of points z, in 
D that converges in D, show that f(z) = fo(z) in D. 


13-22 


SINGULARITIES 


Determine the location of the singularities, including those 
at infinity. For poles also state the order. Give reasons. 


B 1 zz ztl 
“(+2 z-i «i 
2 
14. e+ + — - a 
z-i @-)d 
15. z exp (1/(¢ — 1 — i)”) 16. tan az 
1 
17. cot*z 18. <2 exp (+) 
oe 
19. 1/(e* — e”*) 20. 1/(cos z — sin z) 
21. eV" /(e* — 1) 22. (¢ — mw)! sin z 


23. 


24. 


25. 


Essential singularity. Discuss e” * ina similar way as 
e'/* is discussed in Example 3 of the text. 

Poles. Verify Theorem | for f(z) = z~3 — <7}. Prove 
Theorem 1. 


Riemann sphere. Assuming that we let the image of 
the x-axis be the meridians 0° and 180°, describe and 
sketch (or graph) the images of the following regions 
on the Riemann sphere: (a) |z| > 100, (b) the lower 
half-plane, (c) 3 = |-| =2. 


16.3 Residue Integration Method 


We now cover a second method of evaluating complex integrals. Recall that we solved 
complex integrals directly by Cauchy’s integral formula in Sec. 14.3. In Chapter 15 we 
learned about power series and especially Taylor series. We generalized Taylor series to 
Laurent series (Sec. 16.1) and investigated singularities and zeroes of various functions 
(Sec. 16.2). Our hard work has paid off and we see how much of the theoretical groundwork 
comes together in evaluating complex integrals by the residue method. 

The purpose of Cauchy’s residue integration method is the evaluation of integrals 


; S (2) dz 


Cc 


taken around a simple closed path C. The idea is as follows. 
If f(z) is analytic everywhere on C and inside C, such an integral is zero by Cauchy’s 
integral theorem (Sec. 14.2), and we are done. 


720 


EXAMPLE 1 


EXAMPLE 2 
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The situation changes if f(z) has a singularity at a point z = Zo inside C but is otherwise 
analytic on C and inside C as before. Then f(z) has a Laurent series 


© b 
FO = LY ane — 20)" + —— 4 2 


T 
2 
Zo. (Z — Zo) 


n=0 


that converges for all points near z = zo (except at z = Zo itself), in some domain of the 
form0 < |z — Zol < R (sometimes called a deleted neighborhood, an old-fashioned term 
that we shall not use). Now comes the key idea. The coefficient b, of the first negative 
power 1/(z — Zo) of this Laurent series is given by the integral formula (2) in Sec. 16.1 
with n = 1, namely, 


hee ; foe. 


2771 
Cc 


Now, since we can obtain Laurent series by various methods, without using the integral 
formulas for the coefficients (see the examples in Sec. 16.1), we can find b, by one of 
those methods and then use the formula for b; for evaluating the integral, that is, 


(1) ; f(2) dz = 27 iby. 


Cc 


Here we integrate counterclockwise around a simple closed path C that contains z = zo 
in its interior (but no other singular points of f(z) on or inside C!). 
The coefficient b; is called the residue of f(z) at z = zo and we denote it by 


(2) by = Res f(2). 
2=25 
Evaluation of an Integral by Means of a Residue 


Integrate the function f(z) = z ‘sin z counterclockwise around the unit circle C. 


Solution. From (14) in Sec. 15.4 we obtain the Laurent series 


sinz_ 1 I fe 
i zs Ble a) ie 
which converges for |z| > 0 (that is, for all z # 0). This series shows that f(z) has a pole of third order at z = 0 
and the residue by = —2). From (1) we thus obtain the answer 
; Sn de = 2ariby He | 
ac Zz 3 


CAUTION! Use the Right Laurent Series! 
Integrate f(z) = 1/(23 = zy) clockwise around the circle C: |z| = 3, 


Solution. 2 - A= 21 — z) shows that f(z) is singular at z = 0 and z = 1. Now z = 1 lies outside C. 
Hence it is of no interest here. So we need the residue of f(z) at 0. We find it from the Laurent series that 
converges for 0 < |z| < 1. This is series (I) in Example 4, Sec. 16.1, 


b—-+1]+z¢4+ 0: (0 < |z| < 1). 
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PROOF 


We see from it that this residue is 1. Clockwise integration thus yields 


dz 
= —277i Res f(z) = —27i. 
egnz a 


CAUTION! Had we used the wrong series (II) in Example 4, Sec. 16.1, 


1 1 1 1 
3 4 4 5 6 


BS. z 


(lal > D, 
we would have obtained the wrong answer, 0, because this series has no power 1/z. ia 


Formulas for Residues 


To calculate a residue at a pole, we need not produce a whole Laurent series, but, more 
economically, we can derive formulas for residues once and for all. 


Simple Poles at zo. A first formula for the residue at a simple pole is 
(3) Res f@=h = jim (z — Zo) f(z). (Proof below). 
A second formula for the residue at a simple pole is 


P(Z) ~— p(Zo) 
4) BO ee RE ae areal 


(Proof below). 


In (4) we assume that f(z) = p(z)/q(z) with p(zo) # 0 and g(z) has a simple zero at zo, 
so that f(z) has a simple pole at zg by Theorem 4 in Sec. 16.2. 


We prove (3). For a simple pole at z = zo the Laurent series (1), Sec. 16.1, is 


b 
FQ) = az t ao + aye — 20) + ale ~ 20) +>  < |z - zol < R). 


Here b, # 0. (Why?) Multiplying both sides by z — zg and then letting z — Zo, we obtain 
the formula (3): 


jim @ — zo)f@) = by + jim (z — zo)l4o + a1 — Zo) + 1 = by 


where the last equality follows from continuity (Theorem 1, Sec. 15.3). 
We prove (4). The Taylor series of g(z) at a simple zero Z 9 is 


(z — zo)” 


5, 4 0) + ° 


q(z) = (z — z0)q'(zo) + 


Substituting this into f = p/q and then f into (3) gives 


0 oes (z ~ zo)p) 
q(z) 20 (z — zo)lq'(zo) + ( — 20)¢"(Zo)/2 + sea] 


Res f(z) = lim (z — zo) 
Z=Zo 2% 


Z — Zg cancels. By continuity, the limit of the denominator is q'(z o) and (4) follows. 
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EXAMPLE 3_ Residue at a Simple Pole 


f(2) = (9z + d/ (2? + z) has a simple pole at i because 2 +1 = (2+ iz — A, and (3) gives the residue 


92 +i , 92 +i Oz +i 10i 
Res —5 lim (z — i) - - | - Si. 
zi 2(z~° + 1) 2% az + iz — i) az + i) Ie=4 —2 
By (4) with p(i) = 91 + iand q'(z) = 3z7 + 1 we confirm the result, 
Oz +i 9z +i 10i 
Res —5 | 3 Si. a 
z=i xz" + 1) 327 ede 2 


Poles of Any Order at Zo. The residue of f(z) at an mth-order pole at zo is 


— { aes mn 
(5) Res f(z) = lim c me co @| : 
a = I es, 


Z=Zo ot 


In particular, for a second-order pole (m = 2), 


(5*) Resp) = ia zo) f@)I }- 


2=20 


PROOF We prove (5). The Laurent series of f(z) converging near zo (except at Z9 itself) is (Sec. 16.2) 


bm Dm—1 
f@ = i Se te 


G-2" @— 2)" Z— 20 


+ dg + a — Zo) + °° 


where b,, # 0. The residue wanted is b,. Multiplying both sides by (z — zo)” gives 
2 = 20) $O) = bat aie = Boye Ho Peso) E25) 


We see that b, is now the coefficient of the power (z — zo)” tof the power series of 
2(z) = (z — Zo)'"f(z). Hence Taylor’s theorem (Sec. 15.4) gives (5): 


ee 
~ (m— 1)! 


1 Gel 
~ (m = 1)! dz} Lz Zo) f(2)].- ite 


by gure 0) 


EXAMPLE 4_ Residue at a Pole of Higher Order 


F(z) 502/(<3 + 222 —7z+ 4) has a pole of second order at z= 1 because the denominator equals 
(z+ 4\(z - 1? (verify!). From (5*) we obtain the residue 


8. BH 


50z ) 200 


d d 
Res f lim 1? f lim ( 
Res f(2) = lim ee — D°f@I = tim — (= 
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THEOREM 1 


PROOF 


Several Singularities Inside the Contour. 
Residue Theorem 


Residue integration can be extended from the case of a single singularity to the case of 
several singularities within the contour C. This is the purpose of the residue theorem. The 
extension is surprisingly simple. 


Residue Theorem 


Let f(z) be analytic inside a simple closed path C and on C, except for finitely many 


singular points Z1, Z2,°** , Z~ inside C. Then the integral of f(z) taken counterclockwise 
around C equals 27ri times the sum of the residues of f(z) at Z4,°**. Zk! 
k 
(6) $10 dz = Qari >, Res f(z). 
2=2; 
Cc j=l 2 


We enclose each of the singular points z; in a circle C; with radius small enough that 
those k circles and C are all separated (Fig. 373 where k = 3). Then f(z) is analytic in the 
multiply connected domain D bounded by C and Cj,---, C;, and on the entire boundary 
of D. From Cauchy’s integral theorem we thus have 


fle) de + > f(@) dz rors + } Me) de= 0, 
C 


Cz Ik 


(7) ; faz + ; 


Cc Cy 


the integral along C being taken counterclockwise and the other integrals clockwise (as in 
Figs. 354 and 355, Sec. 14.2). We take the integrals over C,,---,C, to the right and 
compensate the resulting minus sign by reversing the sense of integration. Thus, 


(8) ; f(@ dz = ; (2) dz + ; f(z) dz + +: +f f(z) dz 


Cc C1 Cy Cr 


where all the integrals are now taken counterclockwise. By (1) and (2), 


; f(z) dz = 277i Res f(z), J Agee, 
C. 2=2j 
so that (8) gives (6) and the residue theorem is proved. 33] 


Fig. 373. Residue theorem 
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This important theorem has various applications in connection with complex and real integrals. 
Let us first consider some complex integrals. (Real integrals follow in the next section.) 
EXAMPLE 5 __ Integration by the Residue Theorem. Several Contours 


Evaluate the following integral counterclockwise around any simple closed path such that (a) 0 and | are inside 
C, (b) 0 is inside, | outside, (c) 1 is inside, 0 outside, (d) 0 and 1 are outside. 


Solution. The integrand has simple poles at 0 and 1, with residues [by (3)] 


4 — 3z 4 — 3z 4— 3z 4 — 3z 
| 4, Res | =1. 
z=0 z 


Ri 
m0 2az—- 1) gs] z=1 2z— 1) z 


z 


[Confirm this by (4).] Answer: (a) 27ri(—4 + 1) 677i, (b) —877i, (c) 277i, (d) 0. 3] 


EXAMPLE 6 Another Application of the Residue Theorem 
Integrate (tan z)/ (22 — 1) counterclockwise around the circle C: |z| = 3. 


Solution. tan z is not analytic at +77/2, +377/2,---, but all these points lie outside the contour C. Because 
of the denominator z2 — 1 = (z — 1)(z + 1) the given function has simple poles at +1. We thus obtain from 
(4) and the residue theorem 


; tan z r 4 ( tan z tan z ) 
Z = 277i Ss + Res 
cen l . zt 724] 9 zal p2@— |] 


tan z 
277i 
2z z=-1 


= 27itan 1 = 9.7855i. B 


tan z 
f 
+ 


z=1 2z 


EXAMPLE 7 Poles and Essential Singularities 


Evaluate the following integral, where C is the ellipse Ox? + y? = 9 (counterclockwise, sketch it). 


ze 
; ( - + cer) dz. 
c \z* = 16 


Solution. Since z+ — 16 = Oat *2i and ~2, the first term of the integrand has simple poles at +2i inside 
C, with residues [by (4); note that er = 1) 


“ ze ze 1 
es 
z=-21 z* — 16 | 423 = 16 


and simple poles at +2, which lie outside C, so that they are of no interest here. The second term of the integrand 
has an essential singularity at 0, with residue a/ 2 as obtained from 


1 /z fete 3k il \ a \ f f rd f 
ze zi 1c — + t 313 fr zt+a74 r eee (|| = 1). 
a sw & 


Answer: 27ri(— 7@ — T6 + 3a) awa 4) 30.2217 by the residue theorem. | 
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PROBLEM SET 16-3 


1. Verify the calculations in Example 3 and find the other Pr 
residues: 16. /* dz, C: the unit circle 
2. Verify the calculations in Example 4 and find the other e 
residue. W e k. C:| /2| = 4.5 
RESIDUES "Yi coset Cle mil 
Find all the singularities in the finite plane and the zt+1 
corresponding residues. Show the details. 18. ; 458 dz, C:|z—-1| = 
Z — 2z 
3 sin 2z i COS Z . 
- ‘ sinh z 
a z 19 $ de C:|z - 2i| =2 
5. : 6. tan z . 
1+2 dz 
7 20. C= 13 
7. cot 17z 8 @_ py B (2? +4 1° 
4 
1 Zz 7 
9. - 10. ————_ 21; i ent az, C:|e| =} 
Pe 2 iz+2 Cc 
& Va 
1,—— 12, 2/o-8 
our 22. $= 2. = de C the unit circle 
13. CAS PROJECT. Residue at a Pole. Write a program c 42 


for calculating the residue at a pole of any order in the 


finite plane. Use it for solving Probs. 5-10. 23. 


BOP = 2324 5. _ 
Iz, C the unit circle 
Cc 


2a 1)” (3z:— 1) 


14-25; RESIDUE INTEGRATION 


Evaluate (counterclockwise). Show the details. exp (—2?) 

4, > z, C:|z) = 1.5 

g-= 23 ; sin 4z 
14. —————— ik, Ce 2 1 = 32 ig 
(oe on 4z—5 
z cosh 77z 

25. ; 3 ; iz, |zl = 7 

15. ; tan 2mzdz, C:|z — 0.2] = 0.2 gz + 152° + 36 


Cc 


16.4 Residue Integration of Real Integrals 


Surprisingly, residue integration can also be used to evaluate certain classes of complicated 
real integrals. This shows an advantage of complex analysis over real analysis or calculus. 


Integrals of Rational Functions of cos @ and sin 0 


We first consider integrals of the type 


27 
(1) J= | F (cos 6, sin 9) dO 
0 
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EXAMPLE 1 


CHAP. 16 Laurent Series. Residue Integration 


where F(cos 6, sin 6) is a real rational function of cos @ and sin @ [for example, (sin? 0)/ 


(5 — 4 cos @)] and is finite (does not become infinite) on the interval of integration. Setting 


e” = z, we obtain 


1 0 =e i( “) 
== (e+ =. = 
cos 6 7 Os 5 ie = 
pa . il 1 
sin@ = —(e” — e) = (2 = ) 
21 


Since F is rational in cos @ and sin 0, Eq. (2) shows that F is now a rational function of 
z, say, f(z). Since dz/d0 = ie’, we have d@ = dz/iz and the given integral takes the form 


(2) 


dz 
(3) J= ; Oz 
C Ke 


and, as @ ranges from 0 to 277 in (1), the variable z = en ranges counterclockwise once 
around the unit circle |z| = 1. (Review Sec. 13.5 if necessary.) 


An Integral of the Type (1) 


> do 
Show by the present method that { 277. 


0 2 — cos 0 


Solution. We use cos 6 = 3(z + 1/z) and d@ = dz/iz. Then the integral becomes 


1 7 ; ~ dz 


V2 


4 dz 
ide (@- V2— Diz - V2 + 1) 


We see that the integrand has a simple pole at z} = V2 + 1 outside the unit circle C, so that it is of no interest 
here, and another simple pole at z2 = V2 — 1 (where z — V2 + 1 = 0) inside C with residue [by (3), Sec. 16.3] 


Res : | : 
= (g— V2—-Dze-V2+1)) Lz-V2-1),-\e-4 


Answer: 2ri(—2/i)(—4) = 2a. (Here —2/i is the factor in front of the last integral.) fi] 


As another large class, let us consider real integrals of the form 


(4) | fx) dex. 


Such an integral, whose interval of integration is not finite is called an improper integral, 


and it has the meaning 


e 0 b 
(5’) | fe) dx = lim, | fx) dx + Jim | f(x) dx. 
0 


0 a 
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If both limits exist, we may couple the two independent passages to —% and ©, and write 
R 


(5) | SQ) dx = jim, | F(x) dx. 
R 


—0 = 


The limit in (5) is called the Cauchy principal value of the integral. It is written 


pr. | f(x) dx. 


—2 


It may exist even if the limits in (5’) do not. Example: 


lim | xdx = lim ( ) 0, but lim | xdx = ©, 
Roo Le Roo 2 2 be 


We assume that the function f(x) in (4) is a real rational function whose denominator 
is different from zero for all real x and is of degree at least two units higher than the 
degree of the numerator. Then the limits in (5’) exist, and we may start from (5). We 
consider the corresponding contour integral 


(S*) ; f(z) dz 

Cc 
around a path C in Fig. 374. Since f(x) is rational, f(z) has finitely many poles in the 
upper half-plane, and if we choose R large enough, then C encloses all these poles. By 
the residue theorem we then obtain 
R 


; f(z) dz = | f(z) dz + | f(x) dx = 277i > Res f(z) 
R 


Cc Ss = 


where the sum consists of all the residues of f(z) at the points in the upper half-plane at 
which f(z) has a pole. From this we have 


R 
(6) | f(x) dx = 2ai D) Res f(z) — | f(@ dz. 


-R Ss 


We prove that, if R—> °°, the value of the integral over the semicircle S approaches 
zero. If we set z = Re’, then S is represented by R = const, and as z ranges along S, the 
variable 0 ranges from 0 to 77. Since, by assumption, the degree of the denominator of 
f(z) is at least two units higher than the degree of the numerator, we have 


k 


2 (lz| = R > Ro) 
z 


If@I < 


—R R x 


Fig. 374. Path C of the contour integral in (5*) 
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EXAMPLE 2 


CHAP. 16 Laurent Series. Residue Integration 
for sufficiently large constants k and Rg. By the ML-inequality in Sec. 14.1, 


kar 
R 


(R > Ro). 


| FS (2) dz 


es TR = 
2 
s R 


Hence, as R approaches infinity, the value of the integral over S approaches zero, and (5) 
and (6) yield the result 


(7) | fey ar = ami DResfto 
where we sum over all the residues of f(z) at the poles of f(z) in the upper half-plane. 


An Improper Integral from 0 to 2 


Using (7), show that 


H 


Fig. 375. Example 2 


Solution. Indeed, f(z) = 1 /A+ z4) has four simple poles at the points (make a sketch) 


z= emt Zo = ebr/4 z3 = ears Za = eT Ty4, 


The first two of these poles lie in the upper half-plane (Fig. 375). From (4) in the last section we find the residues 


1 1 1 ; i ee 
Res f | AG | 5 e 8Ty/4 eml/*. 
Z=2 (Lez): Z=21 4z Z=2 4 4 


1 1 1 . : 
Res f@ | | eT 87/4 = eT Ty 
etn (at+z)' |, L4l-., 4 4 


(Here we used e”™ = —1 and e27' = 1.) By (1) in Sec. 13.6 and (7) in this section, 
* dx 277i ; . 277i 7 7 
(e774 — e~7/4) - 2i sin 7 sin : 
f 1+ x4 4 4 4 4 V2 
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EXAMPLE 3 


Since 1/(1 + x4) is an even function, we thus obtain, as asserted, 


I dx | dx 7 w 
ltxt 2) ,14+x4% 2V2 


0 = 


Fourier Integrals 


The method of evaluating (4) by creating a closed contour (Fig. 374) and “blowing it up” 
extends to integrals 


(8) | F(x) cos sx dx and | F(x) sin sx dx (s real) 


-—2 —2% 


as they occur in connection with the Fourier integral (Sec. 11.7). 
If f(x) is a rational function satisfying the assumption on the degree as for (4), we may 
consider the corresponding integral 


; foe dz (s real and positive) 
Cc 


over the contour C in Fig. 374. Instead of (7) we now get 


(9) | f(xe* dx = 2ai S'Res [f(2e**] (s > 0) 


where we sum the residues of foe at its poles in the upper half-plane. Equating the 
real and the imaginary parts on both sides of (9), we have 


| f(x) cos sx dx = —27 >\Im Res Lf@e*], 


—07 


| f(x) sin sx dx = 277 >Re Res [f(2e**]. 


—2 


(10) (s > 0) 


To establish (9), we must show [as for (4)] that the value of the integral over the 
semicircle S$ in Fig. 374 approaches 0 as R— ©. Now s > 0 and S lies in the upper half- 
plane y = 0. Hence 


le%?| = [eOt| = [ee] = 1-e 4% =1 (s>0, y2=0). 


From this we obtain the inequality | f(z)e’**| = | f(@|le**| S |f(|_ (s > 0, y = 0). This 
reduces our present problem to that for (4). Continuing as before gives (9) and (10). ai) 


An Application of (10) 


* cos sx Te 5s * sin sx 
Show that dx =—e™, dx =0 (s > 0,k > 0). 
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Solution. In fact, e?/ (k2 + 2?) has only one pole in the upper half-plane, namely, a simple pole at z = ik, 
and from (4) in Sec. 16.3 we obtain 


‘ eis% eit eT ks 
eik kA? +22 [22 | Dik 
Thus 
oo eis e ks aT 
| a 3g dx = Qi es 
Lok Te 2ik k 
Since e*” = cos sx + isin sx, this yields the above results [see also (15) in Sec. 11.7.] te 


Another Kind of Improper Integral 
We consider an improper integral 


B 


(1) | F(x) dx 
A 


whose integrand becomes infinite at a point a in the interval of integration, 
lim |f(x)| = ~. 
lim || 


By definition, this integral (11) means 


B a-e B 
(12) | fx) dx = lim | Fx) dx + lim | fo dx 
+7 


A A a 


where both € and 7 approach zero independently and through positive values. It may 
happen that neither of these two limits exists if € and 7 go to 0 independently, but the 
limit 

a-e 


B 
| f@) dx + | SQ) as| 


A ate 


03) ig | 


exists. This is called the Cauchy principal value of the integral. It is written 


B 


pr. v. | f(x) dx. 


A 


For example, 


the principal value exists, although the integral itself has no meaning. 
In the case of simple poles on the real axis we shall obtain a formula for the principal 
value of an integral from —% to ». This formula will result from the following theorem. 
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THEOREM -1 


PROOF 


Simple Poles on the Real Axis 


If f(z) has a simple pole at z = a on the real axis, then (Fig. 376) 


lim | f(z) dz = TiRes f(z). 
Co 2=a 


zx 
a-r a atr X« 


Fig. 376. Theorem 1 


By the definition of a simple pole (Sec. 16.2) the integrand f(z) has for 0 < |z — al < R 
the Laurent series 


b 
fe)=zoqgts@, by = Res fF). 


Here g(z) is analytic on the semicircle of integration (Fig. 376) 


Co: z=atre®, OS057T7 


and for all z between Cz and the x-axis, and thus bounded on Cy, say, |g(z)| S M. By 


integration, 


7b : 
| 0 dz = | tire ao + | 2(z) a= pyri | 2(z) dz. 
Cy o re C. Cc. 


2 2 


The second integral on the right cannot exceed Marr in absolute value, by the 
ML-inequality (Sec. 14.1), and ML = Mirr—0 as r— 0. oH 


Figure 377 shows the idea of applying Theorem | to obtain the principal value of the 
integral of a rational function f(x) from —™ to ©. For sufficiently large R the integral over 
the entire contour in Fig. 377 has the value J given by 277i times the sum of the residues 
of f(z) at the singularities in the upper half-plane. We assume that f(x) satisfies the degree 
condition imposed in connection with (4). Then the value of the integral over the large 


—-R a-r a a+rR 


Fig. 377. Application of Theorem 1 


732 


EXAMPLE 4 


CHAP. 16 Laurent Series. Residue Integration 


semicircle S approaches 0 as R->~. For r—0O the integral over Cg (clockwise!) 
approaches the value 


K = —7iRes f(z) 
z=a 


by Theorem |. Together this shows that the principal value P of the integral from —% to 
co plus K equals J; hence P = J — K = J + 7i Res,—, f(z). If f(z) has several simple 
poles on the real axis, then K will be —7ri times the sum of the corresponding residues. 
Hence the desired formula is 


(14) pr. v. | f(x) dx = 277i D'Res f(z) + mi >) Res f(z) 


—2 


where the first sum extends over all poles in the upper half-plane and the second over all 
poles on the real axis, the latter being simple by assumption. 


Poles on the Real Axis 


Find the principal value 


f, dx 
pr. v. . 
-w (47 — 3x + 2x? + 1) 


Solution. Since 


x2 -— 3x+2=(xe- Da-2), 


the integrand f(x), considered for complex z, has simple poles at 


z=1, Res f= |__| 
a1 (z — 22 + 1) dena 


3? 
; 1 
z=2, Res f(z) = a a | 
ae (z — 1)(z" + 1) Jz-2 
a 
5° 
, 1 
2= 14, Res f(z) | ] 
a (@? — 32 + DE +d Jeni 
-_ 1 381 
6 + 2i 20 
and at z = —i in the lower half-plane, which is of no interest here. From (14) we get the answer 
7 d. cee} 11 
prv. | a x 5 2mi( *) + mi ( + ) aay | 
gg OO — 3x + 2)Q" +1) 20 2 35 10 


More integrals of the kind considered in this section are included in the problem set. Try 
also your CAS, which may sometimes give you false results on complex integrals. 
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PROBLEM SET 16-4 


1-9 | INTEGRALS INVOLVING COSINE AND SINE f sin x 


Evaluate the following integrals and show the details of (x — W(x? + hs 
your work. 
i a dx 
2 d0 dO 22. | 2_, 
1. ——_—— 2. Sete =» XX 
i k — cos 0 : 7 + 3cos0 
20 ¢ 
ep | 1 + sin @ 40 af 1+ 4cos 0 9 23-26} IMPROPER INTEGRALS: 
» 37+ cosé 4 17 — 8 cos 6 POLES ON THE REAL AXIS 
: fr eos? 6 i . ———— ai? B A Find the Cauchy principal value (showing details): 
, 5 — 4cos 0 5 — 4cos 6 * dx 7 dx 
0 23. r 24. or car 
Q0 Qa oe aime | -ok +3x"—4 
Ts | 8. | ——..—. d0 
cae! oat eet i 7 
0 0 25. | nae 26. | ax 
‘ = cos 6 9 2X ~ * -2r 1 
i 13 — 12 cos 20 27. CAS EXPERIMENT. Simple Poles on the Real 
Axis. Experiment with integrals [™, f(x) dx, 
10-22} IMPROPER INTEGRALS: f(x) = [@ — a(x — ag) (@ — a), a; real and 
INFINITE INTERVAL OF INTEGRATION all different, k > 1. Conjecture that the principal value 


of these integrals is 0. Try to prove this for a special 


Evaluate the following integrals and show details of your 
k, say, k = 3. For general k. 


work. 
0 a Ps d 28. TEAM PROJECT. Comments on Real Integrals. 
10. | —— 11. | = a5 (a) Formula (10) follows from (9). Give the details. 
-= (l +x") -= (+ x") (b) Use of auxiliary results. Integrating e~* around 
ia dx + ee the boundary C of the rectangle with vertices —a, a, 
‘ et — 2x + 5% _ 6 x2 + (x2 + 4) a + ib, —a + ib, letting a— ©, and using 
2 “ 2 Vir 
+ Ne! a 
14. | we is. [ = dx [ a= 
eg A gis: el 0 
7 Ps how th 
16 | cos 2x ie ‘a | sin 3x ay pane mae 
: 2 2 . 4 - ViT _y> 
-« (*" + I) -ax +1 | e~” cos 2bx dx = aoe ; 
is. [ — a vw. { ° 
: xt + 5x2 44 a . = gta (This integral is needed in heat conduction in Sec. 
i 12.7.) 
20. i a (c) Inspection. Solve Probs. 13 and 17 without 
g = x? calculation. 


CHAPTER-16-REVIEW-QUESTIONS AND PROBLEMS 


1. What is a Laurent series? Its principal part? Its use? 4. Can the residue at a singularity be zero? At a simple 


Give simple examples. pole? Give reason. 
2. What kind of singularities did we discuss? Give defi- 5. State the residue theorem and the idea of its proof from 
nitions and examples. memory. 


3. What is the residue? Its role in integration? Explain 6. How did we evaluate real integrals by residue integration? 
methods to obtain it. How did we obtain the closed paths needed? 
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7. What are improper integrals? Their principal value? 16 15z+ 9 


Why did they occur in this chapter? : f= oe C: || = 4 
8. What do you know about zeros of analytic functions? cos z 

Give examples. 17,.—_, n=0,1,2,---, C: |) =1 
9. What is the extended complex plane? The Riemann “ 

sphere R? Sketch z= 1 +ionR. 18. cot4z, C:|zl =% 


10. What is an entire function? Can it be analytic at 


infinity? Explain the definitions. 19-25 | REAL INTEGRALS 


Evaluate by the methods of this chapter. Show details. 


11-18 | COMPLEX INTEGRALS 


20 
d6 i 
Integrate counterclockwise around C. Show the details. 19. | = =e 20. | ee 
: ) 13 —5siné ) 3+ cosé 
sin 3z 
a a ee 21 — 
12. e?/*, C:|z-—1-i| =2 5 34 — 16 sind 
523 ee a 7 
13. Ae 2. | a | 
3 2+4 Clz|=3 1444 ate 
= “dx * cos x 
14. , C:|z—- il = mi/2 24, | a 95. | : 
o+4 ae x” — 4ix x7 +1 
2537 
15. te te ae 
(Z — 5) 


SUMMARY—OF CHAPTER 16 


Laurent Series. Residue Integration 


A Laurent series is a series of the form 


wo foo b 

(1) {@ = ¥ ake - 2)" + > —* SS (Sec. 16.1) 
n=0 n=1 & — 20) 
or, more briefly written [but this means the same as (1)!] 
~ 1 Le) 
(1*) f@ = An(Z—- 20)", ar = ; dz* 
2. " "Qari Ie (@* — zo)" 

where n = 0, +1, £2,---. This series converges in an open annulus (ring) A with 


center zo. In A the function f(z) is analytic. At points not in A it may have 
singularities. The first series in (1) is a power series. In a given annulus, a Laurent 
series of f(z) is unique, but f(z) may have different Laurent series in different annuli 
with the same center. 

Of particular importance is the Laurent series (1) that converges in a neighborhood 
of zo except at zg itself, say, for 0 < |z — zo| < R(R > 0, suitable). The series 
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(or finite sum) of the negative powers in this Laurent series is called the principal 
part of f(z) at zo. The coefficient by of 1/(z — zo) in this series is called the residue 
of f(z) at Zo and is given by [see (1) and (1*)] 


1 
(2) by = Res f(z) = aan ; SF (z*) dz*. Thus p se" dz* = 277i Res f(z). 
22 TI C 6 Z=Zy 
by can be used for integration as shown in (2) because it can be found from 
m-1 


1 . (ad rs 
(3) Res f(z) = (m — DI jim ( L(z — Zo) veal), (Sec. 16.3), 


Z=% 


provided f(z) has at zg a pole of order m; by definition this means that principal 
part has 1/(z — zo)” as its highest negative power. Thus for a simple pole (m = 1), 


Res f(z) = lim (z — Zo)f(z);_— also, ~—— Res Pe) _ P i) 
z=2o 2% = q(z) F a) 


If the principal part is an infinite series, the singularity of f(z) at zo is called an 
essential singularity (Sec. 16.2). 

Section 16.2 also discusses the extended complex plane, that is, the complex plane 
with an improper point % (“infinity”) attached. 

Residue integration may also be used to evaluate certain classes of complicated 
real integrals (Sec. 16.4). 
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CHAPTER | 7 


Conformal Mapping 


Conformal mappings are invaluable to the engineer and physicist as an aid in solving 
problems in potential theory. They are a standard method for solving boundary value 
problems in two-dimensional potential theory and yield rich applications in electrostatics, 
heat flow, and fluid flow, as we shall see in Chapter 18. 

The main feature of conformal mappings is that they are angle-preserving (except at 
some critical points) and allow a geometric approach to complex analysis. More details 
are as follows. Consider a complex function w = f(z) defined in a domain D of the z—plane; 
then to each point in D there corresponds a point in the w-plane. In this way we obtain a 
mapping of D onto the range of values of f(z) in the w-plane. In Sec. 17.1 we show that 
if f(z) is an analytic function, then the mapping given by w = f(z) isa conformal mapping, 
that is, it preserves angles, except at points where the derivative f'(z) is zero. (Such points 
are called critical points.) 

Conformality appeared early in the history of construction of maps of the globe. 
Such maps can be either “conformal,” that is, give directions correctly, or “equiareal,” 
that is, give areas correctly except for a scale factor. However, the maps will always 
be distorted because they cannot have both properties, as can be proven, see [GenRef8] 
in App. 1. The designer of accurate maps then has to select which distortion to take 
into account. 

Our study of conformality is similar to the approach used in calculus where we study 
properties of real functions y = f(x) and graph them. Here we study the properties of conformal 
mappings (Secs. 17.1—-17.4) to get a deeper understanding of the properties of functions, most 
notably the ones discussed in Chap. 13. Chapter 17 ends with an introduction to Riemann 
surfaces, an ingenious geometric way of dealing with multivalued complex functions such as 
w = sqrt (z) and w = Inz. 

So far we have covered two main approaches to solving problems in complex analysis. 
The first one was solving complex integrals by Cauchy’s integral formula and was broadly 
covered by material in Chaps. 13 and 14. The second approach was to use Laurent series 
and solve complex integrals by residue integration in Chaps. 15 and 16. Now, in Chaps. 17 
and 18, we develop a third approach, that is, the geometric approach of conformal mapping 
to solve boundary value problems in complex analysis. 


Prerequisite: Chap. 13. 
Sections that may be omitted in a shorter course: 17.3 and 17.5. 
References and Answers to Problems: App. 1 Part D, App. 2. 
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17.1 Geometry of Analytic Functions: 
Conformal Mapping 


EXAMPLE 1 


We shall see that conformal mappings are those mappings that preserve angles, except at 
critical points, and that these mappings are defined by analytic functions. A critical point 
occurs wherever the derivative of such a function is zero. To arrive at these results, we 
have to define terms more precisely. 

A complex function 


(1) w = f(Z) = ua, y) + wie, y) =x + ty) 


of a complex variable z gives a mapping of its domain of definition D in the complex 
z-plane into the complex w-plane or onto its range of values in that plane.' For any point zg 
in D the point wo = f(zo) is called the image of zg with respect to f: More generally, for 
the points of a curve C in D the image points form the image of C; similarly for other 
point sets in D. Also, instead of the mapping by a function w = f(z) we shall say more 
briefly the mapping w = f(z). 


Mapping w = f(x) = z” 


Using polar forms z = re” and w = Re’, we have w = 2” = r7e”’, Comparing moduli and arguments gives 


R=r" and ¢ = 20. Hence circles r = rp are mapped onto circles R = r2 and rays 0 = @ onto rays db = 26. 
Figure 378 shows this for the region 1 S |z| S 3, 7/6 = 6 = 7/3, which is mapped onto the region 
1S |wl S$, 7/3 $0 S 27/3. 

In Cartesian coordinates we have z = x + iy and 


u = Re (22) = x72 - y?, v=Im (2?) = 2xy. 


Hence vertical lines x = c = const are mapped onto u = c? — y”, v = 2cy. From this we can eliminate y. We 
obtain y? = c* — wandv? = Ac?y?. Together, 


v2 = 4c2(c? — u) (Fig. 379). 


These parabolas open to the left. Similarly, horizontal lines y = k = const are mapped onto parabolas opening 
to the right, 
v7 = 4k2(k? + u) (Fig. 379). 


(z-plane) (w-plane) 


Fig. 378. Mapping w = z*. Lines |z| = const, arg z = const and their images in the w-plane 


The general terminology is as follows. A mapping of a set A into a set B is called surjective or a mapping of 
A onto B if every element of B is the image of at least one element of A. It is called injective or one-to-one if 
different elements of A have different images in B. Finally, it is called bijective if it is both surjective and injective. 
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THEOREM 1 


PROOF 


CHAP. 17. Conformal Mapping 


Fig. 379. Images of x = const, y = const under w = z? 


Conformal Mapping 


A mapping w = f(z) is called conformal if it preserves angles between oriented curves in 
magnitude as well as in sense. Figure 380 shows what this means. The angle a (0 = a S 77) 
between two intersecting curves C, and C2 is defined to be the angle between their oriented 
tangents at the intersection point zg. And conformality means that the images Cj and C3 
of Cy and Cz make the same angle as the curves themselves in both magnitude and direction. 


Conformality of Mapping by Analytic Functions 


The mapping w = f(z) by an analytic function f is conformal, except at critical 
points, that is, points at which the derivative f' is zero. 


w = 2” has acritical point at z = 0, where f(z) = 2z = 0 and the angles are doubled (see 
Fig. 378), so that conformality fails. 
The idea of proof is to consider a curve 


(2) C: z(t) = x(t) + iy(A) 
in the domain of f(z) and to show that w = f(z) rotates all tangents at a point zg (where 


f'(Zo) # 0) through the same angle. Now z(t) = dz/dt = x(t) + iy(t) is tangent to C in 
(2) because this is the limit of (zy — zo)/At (which has the direction of the secant z1 — zo 


f(z.) 


(z-plane) (w-plane) 


Fig. 380. Curves C, and C, and their respective images 
C7 and C3 under a conformal mapping w = f(z) 
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EXAMPLE 2 


EXAMPLE 3 


in Fig. 381) as zy approaches zg along C. The image C* of C is w = f(z(#)). By the chain 
rule, w = f (c(t))z(t). Hence the tangent direction of C* is given by the argument (use (9) 
in Sec. 13.2) 


(3) arg w = arg f’ + argz 


where arg Z gives the tangent direction of C. This shows that the mapping rotates all 
directions at a point zg in the domain of analyticity of f through the same angle arg f’(zo), 
which exists as long as f'(zg) # 0. But this means conformality, as Fig. 381 illustrates 
for an angle a between two curves, whose images Cj and C5 make the same angle (because 
of the rotation). a 


z= ACN + At) 


Tangent 


Fig. 381. Secant and tangent of the curve C 


In the remainder of this section and in the next ones we shall consider various conformal 
mappings that are of practical interest, for instance, in modeling potential problems. 


Conformality of w = z” 


The mapping w = z”,n = 2,3,---, is conformal, except at z = 0, where w nz} 0. For n = 2 this is 
shown in Fig. 378; we see that at 0 the angles are doubled. For general n the angles at 0 are multiplied by a 
factor n under the mapping. Hence the sector 0 = @ S 7/n is mapped by z” onto the upper half-plane v 2 0 


(Fig. 382). B 


xe u 


Fig. 382. Mapping by w = z” 


Mapping w = z + 1/z. Joukowski Airfoil 
In terms of polar coordinates this mapping is 


1 
w=u+t iv =r(cos6@ + isin@) +4 7 (cos 6 isin 6). 


By separating the real and imaginary parts we thus obtain 


u = acos@, vu =b sing where a=rt+-—, b=r 


Hence circles |z] = r = const # 1 are mapped onto ellipses x7/a? + y?/b? = 1. The circle r = | is mapped 
onto the segment —2 = u S 2 of the w-axis. See Fig. 383. 
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Fig. 383. Example 3 


Now the derivative of w is 


1 @+DE-D) 


2 2 
z z 


which is 0 at z = +1. These are the points at which the mapping is not conformal. The two circles in Fig. 384 
pass through z = —1. The larger is mapped onto a Joukowski airfoil. The dashed circle passes through both — 1 


and | and is mapped onto a curved segment. 
Another interesting application of w = z + 1/z (the flow around a cylinder) will be considered in Sec. 18.4. Ml 


Fig. 384. Joukowski airfoil 


EXAMPLE 4 _ Conformality of w = e” 


From (10) in Sec. 13.5 we have |e*| = e” and Arg z = y. Hence e* maps a vertical straight line x = x9 = const 
onto the circle |w| = e*° and a horizontal straight line y = yo = const onto the ray arg w = yo. The rectangle 
in Fig. 385 is mapped onto a region bounded by circles and rays as shown. 

The fundamental region —7r < Arg z = 77 of e* in the z-plane is mapped bijectively and conformally onto 
the entire w-plane without the origin w = 0 (because e* = 0 for no z). Figure 386 shows that the upper half 
0 <y=7 of the fundamental region is mapped onto the upper half-plane 0 < arg w S 77, the left half being 
mapped inside the unit disk |w| S 1 and the right half outside (why?). ia] 


VU 


0 1 x -3 2 -l 


Fig. 385. Mapping by w = e7 


) x -1 0) 1 u 
(z-plane) (w-plane) 


Fig. 386. Mapping by w = e” 
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Principle of Inverse Mapping. Mapping w = Lnz 


Principle. The mapping by the inverse z = f~*(w) of w = f(z) is obtained by interchanging the roles of the 
z-plane and the w-plane in the mapping by w = f(z). 

Now the principal value w = f(z) = Lnz of the natural logarithm has the inverse z = f~1(w) = e”. From 
Example 4 (with the notations z and w interchanged!) we know that f~1(w) = e” maps the fundamental region 
of the exponential function onto the z-plane without z = 0 (because e” # 0 for every w). Hence w = f(z) = Lnz 
maps the z-plane without the origin and cut along the negative real axis (where 6 = Im Ln z jumps by 277) 
conformally onto the horizontal strip —77 < v S 7 of the w-plane, where w = u + iv. 

Since the mapping w = Lnz + 277 differs from w = Lnz by the translation 277i (vertically upward), this 
function maps the z-plane (cut as before and 0 omitted) onto the strip 7 < vu S 37. Similarly for each of the 
infinitely many mappings w = In z = Lnz + 2n7i (n = O, 1, 2,---). The corresponding horizontal strips of width 
277 (images of the z-plane under these mappings) together cover the whole w-plane without overlapping. B 


Magnification Ratio. By the definition of the derivative we have 


(4) i S(2 7 f(Zo) 
rG Z0 


2% 


= lf'Zo)l. 


Therefore, the mapping w = f(z) magnifies (or shortens) the lengths of short lines by 
approximately the factor |f’(zo)|. The image of a small figure conforms to the original 
figure in the sense that it has approximately the same shape. However, since f'(z) varies 
from point to point, a /arge figure may have an image whose shape is quite different from 
that of the original figure. 

More on the Condition f’(z) # 0. From (4) in Sec. 13.4 and the Cauchy—Riemann 
equations we obtain 


2 2 2 
(5') fol? _ Ou x <u = *u) a (*) = du OU Ou OU 
Ox Ox Ox Ox ax dy dy ax 
that is, 
au au 
6) vers!” ? | 
dv av} Ax)" 
Ox oy 


This determinant is the so-called Jacobian (Sec. 10.3) of the transformation w = f(z) 
written in real form u = u(x, y), v = u(x, y). Hence f'(zo) # 0 implies that the Jacobian 
is not O at zg. This condition is sufficient that the mapping w = f(z) in a sufficiently small 
neighborhood of zo is one-to-one or injective (different points have different images). See 
Ref. [GenRef4] in App. 1. 


PROBLEM SET 17-1 


1. On Fig. 378. One “rectangle” and its image are colored. 
Identify the images for the other “rectangles.” 


2. On Example 1. Verify all calculations. 
3. Mapping w = z°. Draw an analog of Fig. 378 for 


ea ale 


4. Conformality. Why do the images of the straight lines 
x = const and y = const under a mapping by an 
analytic function intersect at right angles? Same 
question for the curves |z| = const and arg z = const. 
Are there exceptional points? 
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5. Experiment on w = z. Find out whether w = z pre- 
serves angles in size as well as in sense. Try to prove 
your result. 


MAPPING OF CURVES 


Find and sketch or graph the images of the given curves 
under the given mapping. 
6 8=1,2,3,4, y= 23,4 wer 
7. Rotation. Curves as in Prob. 6, w = iz 
8. Reflection in the unit circle. |z| = 5 3 1;2.,.3; 
Arg z = 0, £77/4, 77/2, 4377/2 
9. Translation. Curves as in Prob.6,w =z+2+i 
10. CAS EXPERIMENT. Orthogonal Nets. Graph the 
orthogonal net of the two families of level curves 
Re f(z) = const and Imf(z) = const, where (a) f(z) = an 
(b) f@ = 1/z, © f@ = 1/2", @ f@ = & + 0/ 
(1 + iz). Why do these curves generally intersect at 
right angles? In your work, experiment to get the best 
possible graphs. Also do the same for other functions 
of your own choice. Observe and record shortcomings 
of your CAS and means to overcome such deficiencies. 


11-20; MAPPING OF REGIONS 
Sketch or graph the given region and its image under the 
given mapping. 
11. |Z) $3, -7/8<Argz<7/8, w=2 


12. 1 < |z| <3, O0<Argz<7/2, w=23 
13.2 =Imz=5, w=iz 
14.x21, w=1/z 


15. lz-4| =, w= I/z 

16. |Z) <3, Imz>0, w=1/z 
17. -Ln2S3x5Ln4, w=e 
18 -1lex*=2, 


—7TW<y<7, w=e 


19.1<|z)<4, 7/4<0537/4, w=Lnz 
20.5 Sled $1, 050<7/2, w=Lnz 
21-26| FAILURE OF CONFORMALITY 


Find all points at which the mapping is not conformal. Give 
reason. 


21. A cubic polynomial 
22. 22 + 1/2? 
, 1 


- 
23. = 


427 +2 

24. exp (2° — 802) 

25. cosh z 

26. sin 77z 

27. Magnification of Angles. Let f(z) be analytic at zo. 
Suppose that f’(zo) = 0,---,f*~P (zo) = 0. Then the 
mapping w = f(z) magnifies angles with vertex at zo by 
a factor k. Illustrate this with examples for k = 2, 3, 4. 

28. Prove the statement in Prob. 27 for general k = 1, 
2,--+. Hint. Use the Taylor series. 


29-35 | MAGNIFICATION RATIO, JACOBIAN 
Find the magnification ratio M. Describe what it tells 
you about the mapping. Where is M = 1? Find the 
Jacobian J. 


29. w = 52" 
30. w= 23 
31. w = 1/z 
32, w= 1/2" 
33. w = e* 
zt+1 
34. w = —— 
35. w = Lnz 


17.2 Linear Fractional Transformations 
(Mobius Transformations) 


Conformal mappings can help in modeling and solving boundary value problems by first 
mapping regions conformally onto another. We shall explain this for standard regions 
(disks, half-planes, strips) in the next section. For this it is useful to know properties of 
special basic mappings. Accordingly, let us begin with the following very important class. 

The next two sections discuss linear fractional transformations. The reason for our 
thorough study is that such transformations are useful in modeling and solving boundary 
value problems, as we shall see in Chapter 18. The task is to get a good grasp of which 
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conformal mappings map certain regions conformally onto each other, such as, say 
mapping a disk onto a half-plane (Sec. 17.3) and so forth. Indeed, the first step in the 
modeling process of solving boundary value problems is to identify the correct conformal 
mapping that is related to the “geometry” of the boundary value problem. 

The following class of conformal mappings is very important. Linear fractional 
transformations (or Mébius transformations) are mappings 


az+b 
(1) ae ea (ad — bc # 0) 


where a, b, c, d are complex or real numbers. Differentiation gives 


a(cz + d) — claz +b) _ ad — be 
(cz + dy” (cz + dy 


(2) w= 


This motivates our requirement ad — bc # 0. It implies conformality for all z and excludes 
the totally uninteresting case w’ = 0 once and for all. Special cases of (1) are 


w=ztb (Translations) 
w=az with|la| =1 (Rotations) 

°) w=azt+b (Linear transformations) 
w=I1/z Unversion in the unit circle). 


Properties of the Inversion w = 1/z (Fig. 387) 


In polar forms z = re’ and w = Re’ the inversion w = 1/z is 


; 1 I ; 1 
Re? = a =e oe and gives R=-, ¢=—-86. 
re r r 
Hence the unit circle |z| = r = 1 is mapped onto the unit circle |w| = R = 1; w = e’? = e~ Fora general 


z the image w = 1/z can be found geometrically by marking |w| = R = 1/r on the segment from 0 to z and 
then reflecting the mark in the real axis. (Make a sketch.) 

Figure 387 shows that w = 1/z maps horizontal and vertical straight lines onto circles or straight lines. Even 
the following is true. 


w = 1/z maps every straight line or circle onto a circle or straight line. 


Fig. 387. Mapping (Inversion) w = 1/z 
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THEOREM 1 


PROOF 
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Proof. Every straight line or circle in the z-plane can be written 


A(x? + y?) + Bx + Cy +D =0 (A, B C, D real). 


A = 0 gives a straight line and A # 0 a circle. In terms of z and Z this equation becomes 


zt+z ZZ 
oe + C + D=0. 
ae 2i 


Now w = 1/z. Substitution of z = 1/w and multiplication by ww gives the equation 


wt+w w—-w — 
A+B PC + Dww = 0 
2 2i 


or, in terms of u and v, 


A+ Bu — Cv + Du? + v7) = 0. 
This represents a circle (if D # 0) or a straight line (if D = 0) in the w-plane. ie] 
The proof in this example suggests the use of z and z instead of x and y, a general principle 


that is often quite useful in practice. 
Surprisingly, every linear fractional transformation has the property just proved: 


Circles and Straight Lines 


Every linear fractional transformation (1) maps the totality of circles and straight 
lines in the z-plane onto the totality of circles and straight lines in the w-plane. 


This is trivial for a translation or rotation, fairly obvious for a uniform expansion or 
contraction, and true for w = 1/z, as just proved. Hence it also holds for composites of 
these special mappings. Now comes the key idea of the proof: represent (1) in terms of 
these special mappings. When c = 0, this is easy. When c # 0, the representation is 

1 a _ ad — be 


where K= 


WS eg se Cc 


This can be verified by substituting K, taking the common denominator and simplifying; 
this yields (1). We can now set 


1 
W4 = CZ, Wo = wt d, Wa = Wyo? wa = Kws, 


and see from the previous formula that then w = wa + a/c. This tells us that (1) is indeed 
a composite of those special mappings and completes the proof. |_| 


Extended Complex Plane 


The extended complex plane (the complex plane together with the point © in Sec. 16.2) 
can now be motivated even more naturally by linear fractional transformations as follows. 

To each z for which cz + d # 0 there corresponds a unique w in (1). Now let c # 0. 
Then for z = —d/c we have cz + d = 0, so that no w corresponds to this z. This suggests 
that we let w = © be the image of z = —d/c. 
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Also, the inverse mapping of (1) is obtained by solving (1) for z; this gives again a 
linear fractional transformation 


= dw — b 
(4) ae eee a 


When c # 0, then cw — a = 0 for w = a/c, and we let a/c be the image of z = %. With 

these settings, the linear fractional transformation (1) is now a one-to-one mapping of the 

extended z-plane onto the extended w-plane. We also say that every linear fractional 

transformation maps “the extended complex plane in a one-to-one manner onto itself.” 
Our discussion suggests the following. 


General Remark. If z = , then the right side of (1) becomes the meaningless expression 
(a: © + b)/(c + © + d). We assign to it the value w = a/cifc # Oandw = ~ifc = 0. 


Fixed Points 


Fixed points of a mapping w = f(z) are points that are mapped onto themselves, are “kept 
fixed” under the mapping. Thus they are obtained from 


w = f(z) = z. 


The identity mapping w = z has every point as a fixed point. The mapping w = Z has 

infinitely many fixed points, w = 1/z has two, a rotation has one, and a translation none 

in the finite plane. (Find them in each case.) For (1), the fixed-point condition w = z is 
az +b 


= th - b= 0: 
(5) Zz one. us CZ (a — d)z 0) 


For c # 0 this is a quadratic equation in z whose coefficients all vanish if and only if the 
mapping is the identity mapping w = z (in this case,a = d # 0, b = c = 0). Hence we have 


Fixed Points 


A linear fractional transformation, not the identity, has at most two fixed points. If 
a linear fractional transformation is known to have three or more fixed points, it must 
be the identity mapping w = z. 


To make our present general discussion of linear fractional transformations even more 
useful from a practical point of view, we extend it by further facts and typical examples, 
in the problem set as well as in the next section. 


PROBLEEM—SET 17-2 


1. Verify the calculations in the proof of Theorem 1, 3. Matrices. If you are familiar with 2 X 2 matrices, 
including those for the case c = 0. prove that the coefficient matrices of (1) and (4) are 
2. Composition of LFTs. Show that substituting a linear inverses of each other, provided that ad — bc = 1, and 
fractional transformation (LFT) into an LFT gives that the composition of LFTs corresponds to the 


an LFT. 


multiplication of the coefficient matrices. 
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4. Fig. 387. Find the image of x = k = const under 11-16 | FIXED POINTS 


w = 1/z. Hint. Use formulas similar to those in Find the fixed points. 
E le 1. 
ane 1. w = (at ib)? 
5. Inverse. Derive (4) from (1) and conversely. v 31 
~w=7- 3 
6. Fixed points. Find the fixed points mentioned in the B ee 
text before formula (5). ee 
14. w=az+b 
INVERSE peu 
Find the inverse z = z(w). Check by solving z(w) for w. 15. w = 2 — 5i 
i 
7w= ; 
- aiz= 1 
2z— 1 16. w= ag or 
= zt+ai 
8. w= z : 
Zod 
17-20| FIXED POINTS 
9 w= 2 = + Find all LFTs with fixed point(s). 
ee 17. <= 18. z= +1 
10. » =——- 19. <= +i 
siz — 1 20. Without any fixed points 


17.3 Special Linear Fractional Transformations 


THEOREM -1 


PROOF 


We continue our study of linear fractional transformations. We shall identify linear fractional 
transformations 


az +b 


(1) ae (ad — be # 0) 


that map certain standard domains onto others. Theorem | (below) will give us a tool for 
constructing desired linear fractional transformations. 

A mapping (1) is determined by a, b, c, d, actually by the ratios of three of these constants 
to the fourth because we can drop or introduce a common factor. This makes it plausible 
that three conditions determine a unique mapping (1): 


Three Points and Their Images Given 


Three given distinct points z1, Z2, Z3 can always be mapped onto three prescribed 
distinct points w1, Wz, W3 by one, and only one, linear fractional transformation 
w = f(z). This mapping is given implicitly by the equation 


W—-W, W2—- W3 L— 2 £9) 23 
W—-W3 We- Wy OS ey Gy Sa 


(2) 


Uf one of these points is the point ©, the quotient of the two differences containing 
this point must be replaced by 1.) 


Equation (2) is of the form F(w) = G(z) with linear fractional F and G. Hence 
w=F —1(G(z)) = f(z), where F ~1 is the inverse of F and is linear fractional (see (4) in 
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Sec. 17.2) and so is the composite F-\(G (z)) (by Prob. 2 in Sec. 17.2), that is, w = f(z) 
is linear fractional. Now if in (2) we set w = wy, Wa, wg on the left and z = 21, Za, z3 on 
the right, we see that 


Fw) =0, = Fw) = 1, F(w3) = © 
G(z1) = 0, G(z2) = 1, G(z3) = ©. 


From the first column, F(w1) = G(z}), thus w, = FUG(z)) = f(z1). Similarly, wo = f(z2), 
w3 = f(z3). This proves the existence of the desired linear fractional transformation. 

To prove uniqueness, let w = g(z) be a linear fractional transformation, which also 
maps z; onto w;, j = 1, 2,3. Thus w; = g(z;). Hence g (ws) = z;, where w; = f(zj). 
Together, g ( S(Zj)) = Zj, a mapping with the three fixed points z1, Z2, z3. By Theorem 2 
in Sec. 17.2, this is the identity mapping, g *(f() = z for all z. Thus f(z) = g(z) for all 
z, the uniqueness. 

The last statement of Theorem 1 follows from the General Remark in Sec. 17.2. 


Mapping of Standard Domains by Theorem 1 


Using Theorem 1, we can now find linear fractional transformations of some practically 
useful domains (here called “standard domains”) according to the following principle. 


Principle. Prescribe three boundary points z1, zz, z3 of the domain D in the z-plane. 
Choose their images w 1, we, w3 on the boundary of the image D* of D in the w-plane. 
Obtain the mapping from (2). Make sure that D is mapped onto D*, not onto its 
complement. In the latter case, interchange two w-points. (Why does this help?) 


Fig. 388. Linear fractional transformation in Example 1 


Mapping of a Half-Plane onto a Disk (Fig. 388) 


Find the linear fractional transformation (1) that maps z; = —1, zg = 0,z3 = 1 onto wy = —1, we = —i, 
wW3 = |, respectively. 


Solution. From (2) we obtain 
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EXAMPLE 3 


EXAMPLE 4 
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thus 


Z-4 
—ig+1 

Let us show that we can determine the specific properties of such a mapping without much calculation. For 
z = x we have w = (x — i)/(—ix + 1), thus |w| = 1, so that the x-axis maps onto the unit circle. Since z = i 
gives w = 0, the upper half-plane maps onto the interior of that circle and the lower half-plane onto the exterior. 
z= 0,i, © goontow = —i, 0, i, so that the positive imaginary axis maps onto the segment S:u = 0, -1 Sv S31. 
The vertical lines x = const map onto circles (by Theorem 1, Sec. 17.2) through w = i (the image of z = ~) and 
perpendicular to |w| = 1 (by conformality; see Fig. 388). Similarly, the horizontal lines y = const map onto 
circles through w = i and perpendicular to S (by conformality). Figure 388 gives these circles for y 2 0, and for 
y <0 they lie outside the unit disk shown. fai] 


Occurrence of 


Determine the linear fractional transformation that maps z, = 0, zg = 1,z3 = © onto wy = —1, we = —i, 
w3 = |, respectively. 


Solution. From (2) we obtain the desired mapping 


This is sometimes called the Cayley transformation.” In this case, (2) gave at first the quotient (1 — ~)/(z — ©), 
which we had to replace by 1. 
Mapping of a Disk onto a Half-Plane 


Find the linear fractional transformation that maps z; = —1, zg = i, zg = 1 onto wy = 0, we = i, wg = %, 
respectively, such that the unit disk is mapped onto the right half-plane. (Sketch disk and half-plane.) 


Solution. From (2) we obtain, after replacing (i — ~)/(w — ©) by 1, 


ge] 
i 


z-1. 


Mapping half-planes onto half-planes is another task of practical interest. For instance, 
we may wish to map the upper half-plane y = O onto the upper half-plane v 2 0. Then 
the x-axis must be mapped onto the w-axis. 


Mapping of a Half-Plane onto a Half-Plane 


Find the linear fractional transformation that maps zj = —2, zg = 0,z3 = 2 onto wy = ©, wo = a wW3= 3 
respectively. 


Solution. You may verify that (2) gives the mapping function 


gate 1 
w= 
2z+4 


What is the image of the x-axis? Of the y-axis? || 
Mappings of disks onto disks is a third class of practical problems. We may readily 


verify that the unit disk in the z-plane is mapped onto the unit disk in the w-plane by the 
following function, which maps Zo onto the center w = 0. 


?ARTHUR CAYLEY (1821-1895), English mathematician and professor at Cambridge, is known for his 
important work in algebra, matrix theory, and differential equations. 
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EXAMPLE 6 


He — Fey 


(3) . a 


; lze| <1. 
Ge= Il 


NI 
S 


To see this, take |z| = 1, obtaining, with c = Zo as in (3), 


Iz — zol = |z-—cl 


= |zlIz—- el 

= |zz — cz| = |1 — cz| = |ez — 1. 
Hence 

lw] = Iz — zol/lez — 1] = 1 


from (3), so that lz] = 1 maps onto lw| = 1, as claimed, with Zo going onto 0, as the 
numerator in (3) shows. 

Formula (3) is illustrated by the following example. Another interesting case will be 
given in Prob. 17 of Sec. 18.2. 


Mapping of the Unit Disk onto the Unit Disk 


Taking zo = 5 in (3), we obtain (verify!) 


(Fig. 389). Mf 


Fig. 389. Mapping in Example 5 


Mapping of an Angular Region onto the Unit Disk 


Certain mapping problems can be solved by combining linear fractional transformations with others. For instance, 
to map the angular region D: —7r/6 S arg z S 77/6 (Fig. 390) onto the unit disk |w| S 1, we may map D by 
Z = 2? onto the right Z-half-plane and then the latter onto the disk |w| S 1 by 


w=i r combined w=i 
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(z-plane) 


(Z-plane) 


(w-plane) 


Fig. 390. Mapping in Example 6 


This is the end of our discussion of linear fractional transformations. In the next section 
we turn to conformal mappings by other analytic functions (sine, cosine, etc.). 


PROBLEM SET 17-3 


. CAS EXPERIMENT. Linear Fractional Transfor- 


mations (LFTs). (a) Graph typical regions (squares, 
disks, etc.) and their images under the LFTs in 
Examples 1-5 of the text. 


(b) Make an experimental study of the continuous 
dependence of LFTs on their coefficients. For instance, 
change the LFT in Example 4 continuously and graph 
the changing image of a fixed region (applying animation 
if available). 


. Inverse. Find the inverse of the mapping in Example 1. 


Show that under that inverse the lines x = const are 
the images of circles in the w-plane with centers on the 
line v = 1. 


. Inverse. If w = f(z) is any transformation that has an 


inverse, prove the (trivial!) fact that f and its inverse 
have the same fixed points. 


. Obtain the mapping in Example 1 of this section from 


Prob. 18 in Problem Set 17.2. 


. Derive the mapping in Example 2 from (2). 


6. Derive the mapping in Example 4 from (2). Find its 


inverse and the fixed points. 


. Verify the formula for disks. 


8-16 


LFTs FROM THREE POINTS AND IMAGES 


Find the LFT that maps the given three points onto the three 
given points in the respective order. 


18. 
. Find an analytic function w = f(z) that maps the region 


20. 


. 0, 1, 2 onto 1, 4,4 

. 1, i, —1 onto i, -—1, -i 

. O, —i,i onto —1,0, 

. —1,0, 1 onto -i, -1,i 

. O, 21, —2i onto —1, 0, 

. 0, 1, © onto ©, 1,0 

. —1,0,1 ontol,1 + 7,14 2i 
15. 
. —3,0, 1 onto 0, 3, 1 


. Find an LFT that maps lz] = 1 onto |w| S 1 so that 


1, i, 2 onto 0, —i — 1, -4 


z = i/2 is mapped onto w = 0. Sketch the images of 
the lines x = const and y = const. 


Find all LFTs w(z) that map the x-axis onto the w-axis. 


0 S arg z S 77/4 onto the unit disk |w| S 1. 

Find an analytic function that maps the second quadrant 
of the z-plane onto the interior of the unit circle in the 
w-plane. 


17.4 Conformal Mapping by Other Functions 


We shall now cover mappings by trigonometric and hyperbolic analytic functions. So far 
we have covered the mappings by z” and e* (Sec. 17.1) as well as linear fractional 
transformations (Secs. 17.2 and 17.3). 


Sine Function. Figure 391 shows the mapping by 


(1) 


w=u-+t iv = sinz = sinxcoshy + icos x sinh y 


(Sec. 13.6). 
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NIA 


(z-plane) (w-plane) 
Fig. 391. Mapping w = u + iv = sinz 


Hence 
(2) u = sinx cosh y, v = cos x sinh y. 


Since sin z is periodic with period 277, the mapping is certainly not one-to-one if we 
consider it in the full z-plane. We restrict z to the vertical strip S: 30 =x 37 in 
Fig. 391. Since f(z) = cos z = Oat z +3 77, the mapping is not conformal at these two 
critical points. We claim that the rectangular net of straight lines x = const and y = const 
in Fig. 391 is mapped onto a net in the w-plane consisting of hyperbolas (the images of 
the vertical lines x = const) and ellipses (the images of the horizontal lines y = const) 
intersecting the hyperbolas at right angles (conformality!). Corresponding calculations are 


simple. From (2) and the relations sin? x + cos” x = 1 and cosh? yo sinh? y= 1 we 


obtain 
uz v? 
5 = cosh” y= sinh” y=1 (Hyperbolas) 
sin” x cos’ x 
2 2 
Uu a) : : 
z — sin? x + cos? x = 1 (Ellipses). 
cosh“ y sinh” y 
Exceptions are the vertical lines x = — xX = 577, which are “folded” onto u S —1 and 


u = 1(v = O), respectively. 

Figure 392 illustrates this further. The upper and lower sides of the rectangle are mapped 
onto semi-ellipses and the vertical sides onto —cosh | Su S —1 and 1 =u Scoshl 
(v = 0), respectively. An application to a potential problem will be given in Prob. 3 of 
Sec. 18.2. 


y VU 
Cc 1 B 
D A C* B* 
a= a x E* Fe u 
2 2 
E -l F 


Fig. 392. Mapping by w = sinz 
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Cosine Function. The mapping w = cos z could be discussed independently, but since 
(3) w =cosz = sin (z + 47), 


we see at once that this is the same mapping as sin z preceded by a translation to the right 
through 37 units. 


Hyperbolic Sine. Since 
(4) w = sinh z = —isin (iz), 


the mapping is a counterclockwise rotation Z = iz through 377 (i.e., 90°), followed by the 
sine mapping Z* = sin Z, followed by a clockwise 90°-rotation w = —iZ™. 


Hyperbolic Cosine. This function 
(5) w = cosh z = cos (iz) 


defines a mapping that is a rotation Z = iz followed by the mapping w = cos Z. 

Figure 393 shows the mapping of a semi-infinite strip onto a half-plane by w = cosh z. 
Since cosh 0 = 1, the point z = O is mapped onto w = 1. For real z = x 2 0, cosh z is 
real and increases with increasing x in a monotone fashion, starting from 1. Hence the 
positive x-axis is mapped onto the portion u = 1 of the u-axis. 

For pure imaginary z = iy we have cosh iy = cos y. Hence the left boundary of the strip 
is mapped onto the segment | = u = —1 of the u-axis, the point z = 7ri corresponding to 


w = coshi7T = cos7 = —1. 


On the upper boundary of the strip, y = 7, and since sin 77 = O and cos 7 = —1, it 
follows that this part of the boundary is mapped onto the portion u = —1 of the uw-axis. 
Hence the boundary of the strip is mapped onto the u-axis. It is not difficult to see that 
the interior of the strip is mapped onto the upper half of the w-plane, and the mapping is 
one-to-one. 

This mapping in Fig. 393 has applications in potential theory, as we shall see in Prob. 12 
of Sec. 18.3. 


Fig. 393. Mapping by w = coshz 


Tangent Function. Figure 394 shows the mapping of a vertical infinite strip onto the 
unit circle by w = tan z, accomplished in three steps as suggested by the representation 
(Sec. 13.6) 

sinz (e®@-e)/i (e* — 1/i 


w = tanz ; = : 
COS Z el +e 1z ez | 
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Hence if we set Z = e?” and use 1/i = —i, we have 
: Z 1 iz 
= tanz = —iW, Ww= Z=e™, 
(6) Ww an Zz iW, 741? e 


We now see that w = tan zis a linear fractional transformation preceded by an exponential 
mapping (see Sec. 17.1) and followed by a clockwise rotation through an angle 377(90°). 
The strip is S$: — a7 <i £7, and we show that it is mapped onto the unit disk in 
the w-plane. Since Z = et — ep 2U +2 We see from (10) in Sec. 13.5 that |Z| = e~Y, 
Arg Z = 2x. Hence the vertical lines x = —7/4,0, 77/4 are mapped onto the rays 
Arg Z = —77/2, 0, 77/2, respectively. Hence S is mapped onto the right Z-half-plane. Also 
Zee if y > Oand |Z| > 1 if y < 0. Hence the upper half of S is mapped inside 
the unit circle |Z] = 1 and the lower half of S outside |Z| = 1, as shown in Fig. 394. 
Now comes the linear fractional transformation in (6), which we denote by g(Z): 


Z= i 
Z+1° 


(7) W= gZ) = 


For real Z this is real. Hence the real Z-axis is mapped onto the real W-axis. Furthermore, 
the imaginary Z-axis is mapped onto the unit circle |W| = 1 because for pure imaginary 
Z = iY we get from (7) 


|W| = |g@¥)| = = 1, 


iY+1 


sr 


The right Z-half-plane is mapped inside this unit circle |W| = 1, not outside, because 
Z = | has its image g(1) = 0 inside that circle. Finally, the unit circle |Z| = 1 is mapped 
onto the imaginary W-axis, because this circle is Z = e’®, so that (7) gives a pure imaginary 
expression, namely, 

ec? —1 — etb/% _ ei#/2 i sin (/2) 


ib — Q — . . . 
) ert 1 et/% 4 e/2 cos (/2) 


g(e 


From the W-plane we get to the w-plane simply by a clockwise rotation through 77/2; see (6). 

Together we have shown that w = tan z maps S: —7/4 < Rez < 77/4 onto the unit 
disk |w| < 1, with the four quarters of S mapped as indicated in Fig. 394. This mapping 
is conformal and one-to-one. 


(z-plane) (Z-plane) (W-plane) (w-plane) 


Fig. 394. Mapping by w = tanz 
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PROBLEM SET 17-4 


CONFORMAL MAPPING w = e” 


1. Find the image of x = c = const, —7 < y S 7, under 


17. Find an analytic function that maps the region R 
bounded by the positive x- and y-semi-axes and the 
hyperbola xy = 7 in the first quadrant onto the upper 


oe half-plane. Hint. First map R onto a horizontal strip. 


2. Find the image of y = k = const, —%® < x S ™, under 
w=e. CONFORMAL MAPPING w = cos z 


3-7 | Find and sketch the image of the given region 18. 
under w = e’*. 


k = const under the 


Find the images of the lines y 
mapping w = cos z. 


3. $3 Sxs 3. —TSysT7 19. Find the images of the lines x = c = const under the 

4.0<x<1, 3<y<1 mapping w = Cos z. 

5. -~m<x< mo, OS ys=27 . ; : 
20-23 | Find and sketch or graph the image of the given 

6.0<x<%», O0<y<7/2 : : 

° region under the mapping w = cos z. 
7.0<x<1, O<y<7 ‘ 
8. CAS EXPERIMENT. Conformal Mapping. If your 22 9<*<2% g<y<!1 


21.0<x< 7/2, 0<y <2 directly and from Prob. 11 
-l<x<1, OSy=1 
T<x<27, y<0 


. Find and sketch the image of the region 2 S |z| S 3, 
7/4 = 0 S 7/2 under the mapping w = Ln z. 


CAS can do conformal mapping, use it to solve Prob. 7. 
Then increase y beyond 77, say, to 5077 or 10077. State 22. 
what you expected. See what you get as the image. 23. 
Explain. 24 


CONFORMAL MAPPING w = sinz 


9. Find the points at which w = sin z is not conformal. 25. 


gS 
Show that w = Ln — 7; maps the upper half-plane 
4 


10. Sketch or graph the images of the lines x = 0, +77/6, 


: : < = ; 
+77/3, +77/2 under the mapping w = sin z. onto the horizontal strip 0 S Im w = 77 as shown in 


the figure. 
11-14| Find and sketch or graph the image of the given 

region under w = sin z. A B COD E 
1. 0<x<7/2, 0<y<2 o- =e 
12. -7/4<x<7/4, 0<y<1 epiale! 
13.0<x<27m7, 1<y<3 ui 
14.0<x<7/6, -~<y<o Cc 
15. Describe the mapping w = cosh z in terms of the map- Doc) E* = A* B*(co) 

ping w = sin z and rotations and translations. 0 
16. Find all points at which the mapping w = cosh 27rz is (w-plane) 

not conformal. Problem 25 


17.5: Riemann Surfaces. Optional 

One of the simplest but most ingeneous ideas in complex analysis is that of Riemann 
surfaces. They allow multivalued relations, such as w = Vz or w = Inz, to become 
single-valued and therefore functions in the usual sense. This works because the Riemann 
surfaces consist of several sheets that are connected at certain points (called branch points). 
Thus w = Vz will need two sheets, being single-valued on each sheet. How many sheets 
do you think w = Inz needs? Can you guess, by recalling Sec. 13.7? (The answer will 
be given at the end of this section). Let us start our systematic discussion. 

The mapping given by 
w=utiv = 2 


(1) (Sec. 17.1) 
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is conformal, except at z = 0, where w’ = 2z =0, Atz=0, angles are doubled under 
the mapping. Thus the right z-half-plane (including the positive y-axis) is mapped onto 
the full w-plane, cut along the negative half of the u-axis; this mapping is one-to-one. 
Similarly for the left z-half-plane (including the negative y-axis). Hence the image of the 
full z-plane under w = z” “covers the w-plane twice” in the sense that every w # 0 is the 
image of two z-points; if zy is one, the other is —z,. For example, z = i and —i are both 
mapped onto w = —1. 

Now comes the crucial idea. We place those two copies of the cut w-plane upon each 
other so that the upper sheet is the image of the right half z-plane R and the lower sheet 
is the image of the left half z-plane L. We join the two sheets crosswise along the cuts 
(along the negative u-axis) so that if z moves from R to L, its image can move from the 
upper to the lower sheet. The two origins are fastened together because w = 0 is the image 
of just one z-point, z = 0. The surface obtained is called a Riemann surface (Fig. 395a). 
w = 0 is called a “winding point” or branch point. w = z7 maps the full z-plane onto 
this surface in a one-to-one manner. 

By interchanging the roles of the variables z and w it follows that the double-valued 
relation 


(2) w=Vz (Sec. 13.2) 


becomes single-valued on the Riemann surface in Fig. 395a, that is, a function in the usual 
sense. We can let the upper sheet correspond to the principal value of Vz. Its image is 
the right w-half-plane. The other sheet is then mapped onto the left w-half-plane. 


3 
(a) Riemann surface of Vz (b) Riemann surface of Vz 


Fig. 395. Riemann surfaces 


Similarly, the triple-valued relation w = Wz becomes single-valued on the three-sheeted 
Riemann surface in Fig. 395b, which also has a branch point at z = 0. 
The infinitely many-valued natural logarithm (Sec. 13.7) 


w=Inz=Lnz+ 2n77i (n = 0, £1, +2,---) 


becomes single-valued on a Riemann surface consisting of infinitely many sheets, 
w = Lnz corresponds to one of them. This sheet is cut along the negative x-axis and the 
upper edge of the slit is joined to the lower edge of the next sheet, which corresponds to 
the argument 77 < 6 S 377, that is, to 


w=Lnz+ 2771. 


The principal value Ln z maps its sheet onto the horizontal strip —77 <v S 7. The 
function w = Ln z + 277i maps its sheet onto the neighboring strip 7 < v S$ 377, and so 
on. The mapping of the points z # 0 of the Riemann surface onto the points of the w-plane 
is one-to-one. See also Example 5 in Sec. 17.1. 
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1. If z moves from z = t twice around the circle |z| = i, 
what does w = Vz do? 

2. Show that the Riemann surface of w= 
V(z — 1)(z — 2) has branch points at 1 and 2 sheets, 
which we may cut and join crosswise from 1 to 2. 
Hint. Introduce polar coordinates z — 1 = rye“ and 
2-25 roe’”?, so that w = Vryre ert 62)/2. 

3. Make a sketch, similar to Fig. 395, of the Riemann 
surface of w = Wz + I. 


PROBLEM SET 17-5 


4-10| RIEMANN SURFACES 


Find the branch points and the number of sheets of the 
Riemann surface. 


4. Viz—24i 5.22 + Wace ti 
6. In (6z — 2i) 7. Wz - zo 


&. 2". Ve 
10. V(4 — 27). — z?) 


9 Ve4+2 


CHAPTER -1T7- REVIEW QUESTIONS AND PROBLEMS 


1. What is a conformal mapping? Why does it occur in 
complex analysis? 


2. At what points are w = 2 — zand w = cos (az?) not 
conformal? 

3. What happens to angles at zo under a mapping w = f(z) 
if f'Zo) = 0, f"Zo) = 0, fo) # 0? 

4. What is a linear fractional transformation? What can 
you do with it? List special cases. 

5. What is the extended complex plane? Ways of intro- 
ducing it? 

6. What is a fixed point of a mapping? Its role in this 
chapter? Give examples. 

7. How would you find the image of x = Re z = | under 
w = iz, 2”, e%, 1/2? 

8. Can you remember mapping properties of w = In z? 

9. What mapping gave the Joukowski airfoil? Explain 


details. 
10. What is a Riemann surface? Its motivation? Its simplest 
example. 
11-16] MAPPING w = z” 


Find and sketch the image of the given region or curve 
under w = 2”. 

WW. 1 < |z| <2, largz| < 7/8 

12. 1/Vi < |z)}< Va, 0<argz< 7/2 


13. -4<xy<4 14.0<y<2 
15. x= —-1,1 16. y = —2,2 
17-22} MAPPING w = 1/z 


Find and sketch the image of the given region or curve 
under w = 1/z. 


17. |z| <1 
18. |e] <1, O<argz< 7/2 
19.2<|z)<3, y>0O 20.0Sargzsm/4 


21-3) +y?=4, y>O0 
22.c=1t+iy (-“~<y<%) 


23-28} LINEAR FRACTIONAL 
TRANSFORMATIONS (LFTs) 

Find the LFT that maps 

23. —1,0, 1 onto 4 + 3i, 5i/2, 4 — 3i, respectively 


24. 0,2, 4 onto », z, z respectively 


25. 1, i, —i onto i, —1, 1, respectively 

26. 0, 1,2 onto 2i, 1 + 2i, 2 + 2i, respectively 
27. 0,1, © onto ~, 1, 0, respectively 

28. —1, —i,i onto 1 — i, 2, 0, respectively 


29-34| FIXED POINTS 
Find the fixed points of the mapping 


29. w=(2+ iz 30. w=c*+7- 64 
31. w = Be +2/(<- 1) 32. Qiz — 1)/(z + 2i) 
33. w = 2? + 10z3 + 10z 


34. w = (iz + 5/52 +i 


35-40} GIVEN REGIONS 

Find an analytic function w = f(z) that maps 

35. The infinite strip 0 < y < 77/4 onto the upper half- 
plane v > 0. 

36. The quarter-disk |z| < 1, x > 0, y > Oonto the exterior 
of the unit circle |w| = 1. 

37. The sector 0 < arg z < 7/2 onto the region u < 1. 

38. The interior of the unit circle |z| = 1 onto the exterior 
of the circle |w + 2] = 2. 

39. The region x > 0, y > 0,xy <c onto the strip 0 < 
v<l. 

40. The semi-disk |z] < 2, y > 0 onto the exterior of the 

circle |w — 7| = 7. 
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SUMMARY—-OF CHAPTER +7 


Conformal Mapping 


A complex function w = f(z) gives a mapping of its domain of definition in the 
complex z-plane onto its range of values in the complex w-plane. If f(z) is analytic, 
this mapping is conformal, that is, angle-preserving: the images of any two 
intersecting curves make the same angle of intersection, in both magnitude and sense, 
as the curves themselves (Sec. 17.1). Exceptions are the points at which f(z) = 0 
(“critical points,” e.g. z = 0 for w = 2), 

For mapping properties of e*, cos z, sin z etc. see Secs. 17.1 and 17.4. 

Linear fractional transformations, also called Mébius transformations 


1 avert (Secs. 17.2, 17.3) 
(1) Wade ecs. 17.2, 17. 


(ad — bc # 0) map the extended complex plane (Sec. 17.2) onto itself. They solve 
the problems of mapping half-planes onto half-planes or disks, and disks onto disks 
or half-planes. Prescribing the images of three points determines (1) uniquely. 

Riemann surfaces (Sec. 17.5) consist of several sheets connected at certain points 
called branch points. On them, multivalued relations become single-valued, that is, 
functions in the usual sense. Examples. For w = Vz we need two sheets (with branch 
point 0) since this relation is doubly-valued. For w = In z we need infinitely many 
sheets since this relation is infinitely many-valued (see Sec. 13.7). 
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CHAPTER | 8 


Complex Analysis 
and Potential Theory 


In Chapter 17 we developed the geometric approach of conformal mapping. This meant 
that, for a complex analytic function w = f(z) defined in a domain D of the z-plane, we 
associated with each point in D a corresponding point in the w-plane. This gave us a 
conformal mapping (angle-preserving), except at critical points where f(z) = 0. 

Now, in this chapter, we shall apply conformal mappings to potential problems. This 
will lead to boundary value problems and many engineering applications in electrostatics, 
heat flow, and fluid flow. More details are as follows. 

Recall that Laplace’s equation V’® = 0 is one of the most important PDEs in 
engineering mathematics because it occurs in gravitation (Secs. 9.7, 12.11), electrostatics 
(Sec. 9.7), steady-state heat conduction (Sec. 12.5), incompressible fluid flow, and other 
areas. The theory of this equation is called potential theory (although “potential” is also 
used in a more general sense in connection with gradients (see Sec. 9.7)). Because we 
want to treat this equation with complex analytic methods, we restrict our discussion to 
the “two-dimensional case.” Then ® depends only on two Cartesian coordinates x and y, 
and Laplace’s equation becomes 


VD = Dyy + Pyy = 0. 


An important idea then is that its solutions ® are closely related to complex analytic 
functions ® + iW as shown in Sec. 13.4. (Remark: We use the notation ® + iV to free 
u and v, which will be needed in conformal mapping u + iv.) This important relation is 
the main reason for using complex analysis in problems of physics and engineering. 

We shall examine this connection between Laplace’s equation and complex analytic 
functions and illustrate it by modeling applications from electrostatics (Secs. 18.1, 
18.2), heat conduction (Sec. 18.3), and hydrodynamics (Sec. 18.4). This in turn will 
lead to boundary value problems in two-dimensional potential theory. As a result, 
some of the functions of Chap. 17 will be used to transform complicated regions into 
simpler ones. 

Section 18.5 will derive the important Poisson formula for potentials in a circular disk. 
Section 18.6 will deal with harmonic functions, which, as you recall, are solutions of 
Laplace’s equation and have continuous second partial derivatives. In that section we will 
show how results on analytic functions can be used to characterize properties of harmonic 
functions. 


Prerequisite: Chaps. 13, 14, 17. 
References and Answers to Problems: App. | Part D, App. 2. 
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18.1 Electrostatic Fields 


Fig. 396. Potential 
in Example 1 


EXAMPLE 1 


EXAMPLE 2 


The electrical force of attraction or repulsion between charged particles is governed by 
Coulomb’s law (see Sec. 9.7). This force is the gradient of a function ®, called the 
electrostatic potential. At any points free of charges, ® is a solution of Laplace’s equation 


V7 = 0. 


The surfaces ® = const are called equipotential surfaces. At each point P at which 
the gradient of ® is not the zero vector, it is perpendicular to the surface ® = const 
through P; that is, the electrical force has the direction perpendicular to the equipotential 
surface. (See also Secs. 9.7 and 12.11.) 

The problems we shall discuss in this entire chapter are two-dimensional (for the 
reason just given in the chapter opening), that is, they model physical systems that lie 
in three-dimensional space (of course!), but are such that the potential ® is independent 
of one of the space coordinates, so that ® depends only on two coordinates, which we 
call x and y. Then Laplace’s equation becomes 


2 2 
() yo -2 2, 9 _ 
ax? ay 


0. 


Equipotential surfaces now appear as equipotential lines (curves) in the xy-plane. 
Let us illustrate these ideas by a few simple examples. 


Potential Between Parallel Plates 


Find the potential ® of the field between two parallel conducting plates extending to infinity (Fig. 396), which 
are kept at potentials ®, and ®y, respectively. 


Solution. From the shape of the plates it follows that ® depends only on x, and Laplace’s equation becomes 
©” = 0. By integrating twice we obtain ® = ax + b, where the constants a and b are determined by the given 
boundary values of ® on the plates. For example, if the plates correspond to x = —1 and x = 1, the solution is 


(x) = 3(Dy — Dy)x + 9(Dy + Dj). 


The equipotential surfaces are parallel planes. si] 


Potential Between Coaxial Cylinders 


Find the potential ® between two coaxial conducting cylinders extending to infinity on both ends (Fig. 397) 
and kept at potentials ©; and Wg, respectively. 


Solution. Were ® depends only on r = Vx? + y?, for reasons of symmetry, and Laplace’s equation 
Up, + ry + Ugg = 0 [(5), Sec. 12.10] with ugg = 0 and u = ® becomes rb” + @! = 0, By separating 
variables and integrating we obtain 


©" 1 : ee , a 
ee In ®’ = -Inr+a, ®@ =-, ®=alnr+b 
® r ig 


and a and b are determined by the given values of ® on the cylinders. Although no infinitely extended conductors 
exist, the field in our idealized conductor will approximate the field in a long finite conductor in that part which 
is far away from the two ends of the cylinders. B 
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EXAMPLE 3 


7 


Fig. 398. Potential 
in Example 3 
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Fig. 397. Potential in Example 2 


Potential in an Angular Region 


Find the potential ® between the conducting plates in Fig. 398, which are kept at potentials @, (the lower plate) 
and ®g, and make an angle a, where 0 < a S 7. (In the figure we have a = 120° = 277/3.) 


Solution. 6 = Argz(z = x + iy ¥ 0) is constant on rays @ = const. It is harmonic since it is the imaginary 
part of an analytic function, Ln z (Sec. 13.7). Hence the solution is 


D(x, y) =at+ bArgz 
with a and b determined from the two boundary conditions (given values on the plates) 
at b(—3a) = Q,, at ba) = Qp,. 


Thus a = (®g + ®4)/2, b = (Bg — ®,)/a. The answer is 


y 


1 1 y 
D(x, y) 5) (Dy + ®,) 4 a (Dy — D1) 8, = arctan —. | 


Complex Potential 


Let ®(x, y) be harmonic in some domain D and V(x, y) a harmonic conjugate of ® in D. 
(Note the change of notation from u and v of Sec. 13.4 to ® and V. From the next section 
on, we had to free u and vu for use in conformal mapping. Then 


(2) F(z) = B(x, y) + iW, y) 


is an analytic function of z = x + iy. This function F is called the complex potential 
corresponding to the real potential ®. Recall from Sec. 13.4 that for given ®, a conjugate 
W is uniquely determined except for an additive real constant. Hence we may say the 
complex potential, without causing misunderstandings. 

The use of F has two advantages, a technical one and a physical one. Technically, F 
is easier to handle than real or imaginary parts, in connection with methods of complex 
analysis. Physically, WV has a meaning. By conformality, the curves VW = const intersect 
the equipotential lines ® = const in the xy-plane at right angles [except where F(z) = OJ. 
Hence they have the direction of the electrical force and, therefore, are called lines 
of force. They are the paths of moving charged particles (electrons in an electron 
microscope, etc.). 
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EXAMPLE 4 


EXAMPLE 5 


EXAMPLE 6 


EXAMPLE 7 


Complex Potential 


In Example 1, a conjugate is VW = ay. It follows that the complex potential is 


F(z) =az+b=ax+b + iay, 


and the lines of force are horizontal straight lines y = const parallel to the x-axis. sl] 


Complex Potential 


1 1 


In Example 2 we have ® = alnr+b=aln|z| + b. A conjugate is V = a Arg z. Hence the complex 
potential is 


F(z) =aLnz+b 


and the lines of force are straight lines through the origin. F(z) may also be interpreted as the complex potential 
of a source line (a wire perpendicular to the xy-plane) whose trace in the xy-plane is the origin. 


Complex Potential 


In Example 3 we get F(z) by noting that i Ln z = iln |z| — Arg z, multiplying this by —b, and adding a: 


F(z) =a — ibLnz=a +b Argz — ibin|zI. 


We see from this that the lines of force are concentric circles |z| = const. Can you sketch them? ie 


Superposition 


More complicated potentials can often be obtained by superposition. 


Potential of a Pair of Source Lines (a Pair of Charged Wires) 


Determine the potential of a pair of oppositely charged source lines of the same strength at the points z = c and 
z = —c on the real axis. 


Solution. From Examples 2 and 5 it follows that the potential of each of the source lines is 


®,=Kin|z-—c| and ®. = —KIn|z+c 


> 


respectively. Here the real constant K measures the strength (amount of charge). These are the real parts of the 
complex potentials 


Fi@)=Kin@-c) and Faz) = ~KLn@ + 0). 


Hence the complex potential of the combination of the two source lines is 


(3) F(z) = Fy(z) + Fo(z) = K [Ln (z — c) — Ln (z + ©)]. 
The equipotential lines are the curves 


2=£ 
zt+e 


= const. 


® = Re F(z) = K In | const, thus 


These are circles, as you may show by direct calculation. The lines of force are 


W = Im F(z) = K[Arg (z — c) — Arg (z + c)] = const. 


We write this briefly (Fig. 399) 


W = K(0; — 02) = const. 
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Now 6, — 6 is the angle between the line segments from z to c and —c (Fig. 399). Hence the lines of force 
are the curves along each of which the line segment S: —c S x S c appears under a constant angle. These curves 
are the totality of circular arcs over S, as is (or should be) known from elementary geometry. Hence the lines 
of force are circles. Figure 400 shows some of them together with some equipotential lines. 

In addition to the interpretation as the potential of two source lines, this potential could also be thought of as 
the potential between two circular cylinders whose axes are parallel but do not coincide, or as the potential 
between two equal cylinders that lie outside each other, or as the potential between a cylinder and a plane wall. 
Explain this using Fig. 400. | 


The idea of the complex potential as just explained is the key to a close relation of potential 
theory to complex analysis and will recur in heat flow and fluid flow. 


Fig. 399. Arguments in Example 7 


PROBLEM SET 18-1 


1-4} COAXIAL CYLINDERS 


Find and sketch the potential between two coaxial cylinders 
of radii ry and rg having potential U, and Us, respectively. 


Fig. 400. Equipotential lines and lines 
of force (dashed) in Example 7 


graphs, Re F(z) and Im F(z) on the same axes). Then 
explore further complex potentials of your choice with 
the purpose of discovering configurations that might 
be of practical interest. 


17 =2.5mm, ro =40cm, U, =0V, 2 
Us = 220 V (a) F(z) =z (b) F(z) = iz 
2r,=lem, m=2cm, U,=400V, U=0V © F@=1We Fe) = i/z 
3.77 = 10cm, ro =1m, U = 10kV, 9. Argument. Show that ® = 6/7 = (1/77) arctan (y/x) 
Uz = —10kV is harmonic in the upper half-plane and satisfies the 
4. Ifr, = 2m, rg = 6 cmand U, = 300 V, Up =100 V, boundary condition ®(x,0) = 1 if x <0 and 0 if 
respectively, is the potential at r = 4cm equal to x > 0, and the corresponding complex potential is 
200 V? Less? More? Answer without calculation. Then F(z) = —(i/77) Lnz. 
calculate and explain. 10. Conformal mapping. Map the upper z-half-plane 
5-7 PARALLEL PLATES onto |w| = 1 so that 0, ©, —1 are mapped onto 1, i, —i, 
: . respectively. What are the boundary conditions on 
Find and sketch the potential between the parallel plates = : oe: 
hase ientiats Di anid Ue. Find thi 1 ieiitial |w| = 1 resulting from the potential in Prob. 9? What 
aving potentials U; and Us. Find the complex potential. Piepiana w= 1 
5. Plates at xy = —Scm, xg = 5cm, potentials U, = 
250 V, Uz = 500 V, respectively. 11. Text Example 7. Verify, by calculation, that the equipo- 
6. Plates at y= x and y=x+k, potentials U, = OV, tential lines are circles. 
Uz = 220 V, respectively. 
7. Plates at xy = 12cm, xg = 24cm, potentials Uy; = 12-15| OTHER CONFIGURATIONS 
20 kV, Uz = 8 kV, respectively. 12. Find and sketch the potential between the axes 


. CAS EXPERIMENT. Complex Potentials. Graph 


the equipotential lines and lines of force in (a)—(d) (four 


(potential 500 V) and the hyperbola xy = 4 (potential 
100 V). 
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13. Arccos. Show that F(z) = arccos z (defined in Problem y | y 
Set 13.7) gives the potential of a slit in Fig. 401. 
y 
{Go i ae 
1 x x 
\| (C- 
we he ; 
Fig. 402. Other apertures 
Fig. 401. Slit 15. Sector. Find the real and complex potentials in the 
sector —77/6 = 0 = 77/6 between the boundary 0 = 
14. Arccos. Show that F(z) in Prob. 13 gives the potentials +77/6, kept at 0 V, and the curve x3 — 3xy” = 1, kept 


in Fig. 402. 


at 220 V. 


18.2 Use of Conformal Mapping. Modeling 


THEOREM 1 


PROOF 


We have just explored the close relation between potential theory and complex analysis. 
This relationship is so close because complex potentials can be modeled in complex 
analysis. In this section we shall explore the close relation that results from the use of 
conformal mapping in modeling and solving boundary value problems for the Laplace 
equation. The process consists of finding a solution of the equation in some domain, 
assuming given values on the boundary (Dirichlet problem, see also Sec. 12.6). The key 
idea is then to use conformal mapping to map a given domain onto one for which the 
solution is known or can be found more easily. This solution thus obtained is then mapped 
back to the given domain. The reason this approach works is due to Theorem 1, which 
asserts that harmonic functions remain harmonic under conformal mapping: 


Harmonic Functions Under Conformal Mapping 


Let ®* be harmonic in a domain D* in the w-plane. Suppose that w = u + iv = f(z) 
is analytic in a domain D in the z-plane and maps D conformally onto D*. Then 
the function 


(1) D(x, y) = B* (u(x, y), v(, y)) 


is harmonic in D. 


The composite of analytic functions is analytic, as follows from the chain rule. Hence, taking 
a harmonic conjugate V*(u, v) of ®*, as defined in Sec. 13.4, and forming the analytic 
function F*(w) = ®*(u, v) + iV*(u, v) we conclude that F(z) = F*( f(z) is analytic in D. 
Hence its real part P(x, y) = Re F(z) is harmonic in D. This completes the proof. 

We mention without proof that if D* is simply connected (Sec. 14.2), then a harmonic 
conjugate of ®* exists. Another proof of Theorem 1 without the use of a harmonic 
conjugate is given in App. 4. a 


764 


EXAMPLE 1 


CHAP. 18 Complex Analysis and Potential Theory 


Potential Between Noncoaxial Cylinders 


Model the electrostatic potential between the cylinders Cy: |z| = 1 and Co: |z — 2 = 2 in Fig. 403. Then give 
the solution for the case that Cy is grounded, U; = 0 V, and Cg has the potential U; = 110 V. 


Solution. We map the unit disk |z| = 1 onto the unit disk |w| = 1 in such a way that C2 is mapped onto 
some cylinder C3: |w| = ro. By (3), Sec. 17.3, a linear fractional transformation mapping the unit disk onto the 
unit disk is 


2) 22-3 
( w= ‘a4 
Uv 
eo U,=0 
U,=110V 
u 
Fig. 403. Example 1: z-plane Fig. 404. Example 1: w-plane 


where we have chosen b = Zo real without restriction. zo is of no immediate help here because centers of circles 
do not map onto centers of the images, in general. However, we now have two free constants b and ro and shall 
succeed by imposing two reasonable conditions, namely, that 0 and 4 (Fig. 403) should be mapped onto rp and 
—ro (Fig. 404), respectively. This gives by (2) 


0-b sae a 
ro = ——_ = b, and with this, To ° 2 : 
a 4b/5—1  4r9/5 —1 


a quadratic equation in ro with solutions 79 = 2 (no good because ro < 1) and rp = 3. Hence our mapping 
function (2) with b = 5 becomes that in Example 5 of Sec. 17.3, 


22=1 
(3) w= f(z) = . 
z-2 


From Example 5 in Sec. 18.1, writing w for z we have as the complex potential in the w-plane the function 
F*(w) = aLnw + k and from this the real potential 


®* (u,v) = Re F*(w) = aln|w| +k. 


This is our model. We now determine a and k from the boundary conditions. If |w| = 1, then ®* =alnl+k=0, 
hence k = 0. If |w| = ro = 3, then ®* = aln (3) = 110, hence a = 110/In (3) = —158.7. Substitution of (3) 
now gives the desired solution in the given domain in the z-plane 


2 = 
F(z) = F*(f()) = aLn 
z- 


The real potential is 


2e°= 11 


D(x, y) = Re F(z) = aln 


; a S837: 


Can we “see” this result? Well, ®(x, y) = const if and only if |(2z — 1I/jz- 2)| = const, that is, |w] = const 
by (2) with b = 3. These circles are images of circles in the z-plane because the inverse of a linear fractional 
transformation is linear fractional (see (4), Sec. 17.2), and any such mapping maps circles onto circles (or straight 
lines), by Theorem 1 in Sec. 17.2. Similarly for the rays arg w = const. Hence the equipotential lines 
(x, y) = const are circles, and the lines of force are circular arcs (dashed in Fig. 404). These two families of 
curves intersect orthogonally, that is, at right angles, as shown in Fig. 404. | 
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EXAMPLE 2 


Potential Between Two Semicircular Plates 


Model the potential between two semicircular plates P, and P: in Fig. 405 having potentials —3000 V and 
3000 V, respectively. Use Example 3 in Sec. 18.1 and conformal mapping. 


Solution. Step I. We map the unit disk in Fig. 405 onto the right half of the w-plane (Fig. 406) by using 
the linear fractional transformation in Example 3, Sec. 17.3: 


1l+z 
w=f(z%= ; 
lg 
y 
-1kV 
-—3 kV 
—2 kV 
Fig. 405. Example 2: z-plane Fig. 406. Example 2: w-plane 

The boundary |z| = 1 is mapped onto the boundary u = 0 (the v-axis), with z = —1, i, | going onto w = 0, i, ©, 
respectively, and z = —i onto w = —i. Hence the upper semicircle of |z| = 1 is mapped onto the upper half, 


and the lower semicircle onto the lower half of the v-axis, so that the boundary conditions in the w-plane are 
as indicated in Fig. 406. 


Step 2. We determine the potential ©* (w, v) in the right half-plane of the w-plane. Example 3 in Sec. 18.1 with 
a = 7, U, = —3000, and Us = 3000 [with ®* (uw, v) instead of B(x, y)] yields 


6000 v 
* = = = 
O*(u, v) Q, @ = arctan at 


On the positive half of the imaginary axis (p = 77/2), this equals 3000 and on the negative half —3000, as it 
should be. ®* is the real part of the complex potential 


F*(w) = — 


6000 i 
Ln w. 
7 


Step 3. We substitute the mapping function into F* to get the complex potential F(z) in Fig. 405 in the form 


6000 i 1+z 
Ln 


F(z) = F*( f(z) = — 


| ee ae 


The real part of this is the potential we wanted to determine: 


6000 1+z_ 6000 l+z 
D(x, y) = Re F(z) a In Ln>— = Tos: 


As in Example | we conclude that the equipotential lines ®(x, y) = const are circular arcs because they 
correspond to Arg [(1 + z)/(1 — z)] = const, hence to Arg w = const. Also, Arg w = const are rays from 0 
to ©, the images of z = —1 and z = 1, respectively. Hence the equipotential lines all have —1 and | (the 
points where the boundary potential jumps) as their endpoints (Fig. 405). The lines of force are circular arcs, 
too, and since they must be orthogonal to the equipotential lines, their centers can be obtained as intersections 
of tangents to the unit circle with the x-axis, (Explain!) Ba 


Further examples can easily be constructed. Just take any mapping w = f(z) in Chap. 17, 
a domain D in the z-plane, its image D* in the w-plane, and a potential ®* in D*. Then (1) 
gives a potential in D. Make up some examples of your own, involving, for instance, 
linear fractional transformations. 
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Basic Comment on Modeling 


We formulated the examples in this section as models on the electrostatic potential. It is 
quite important to realize that this is accidental. We could equally well have phrased 
everything in terms of (time-independent) heat flow; then instead of voltages we would 
have had temperatures, the equipotential lines would have become isotherms (= lines of 
constant temperature), and the lines of the electrical force would have become lines along 
which heat flows from higher to lower temperatures (more on this in the next section). 
Or we could have talked about fluid flow; then the electrostatic lines of force would have 
become streamlines (more on this in Sec. 18.4). What we again see here is the unifying 
power of mathematics: different phenomena and systems from different areas in physics 
having the same types of model can be treated by the same mathematical methods. What 


differs from area to area is just the kinds of problems that are of practical interest. 


PROBLEM SET 18-2 


1 
2. 


. Derivation of (3) from (2). Verify the steps. 


. Second proof. Give the details of the steps given on 
p. A93 of the book. What is the point of that proof? 


3-5| APPLICATION OF THEOREM 1 


3 


NI 


10 


. Find the potential ® in the region R in the first quadrant 
of the z-plane bounded by the axes (having potential 
U,) and the hyperbola y = 1/x (having potential U,) 
by mapping R onto a suitable infinite strip. Show that 
®@ is harmonic. What are its boundary values? 

Let ®* =4w, w=f(2) =e, and D:x<0, 

0 < y <7. Find ®. What are its boundary values? 

. CAS PROJECT. Graphing Potential Fields. 
Graph equipotential lines (a) in Example | of the text, 
(b) if the complex potential is F(z) = 22, iz, e. 
(c) Graph the equipotential surfaces for F(z) = Ln z as 
cylinders in space. 

. Apply Theorem 1 to ®*(u,v) =u? — v7, w= 
f(2) = e*, and any domain D, showing that the resulting 
potential ® is harmonic. 

. Rectangle, sinz. LetD:0 =x 5 37, QS y 31; D* 
the image of D under w = sin z; and ®* = u2 — v?. 
What is the corresponding potential ® in D? What are 
its boundary values? Sketch D and D*. 


. Conjugate potential. What happens in Prob. 7 if you 
replace the potential by its conjugate harmonic? 


. Translation. What happens in Prob. 7 if we replace 
sin z by cos z = sin (z + aq)? 

. Noncoaxial Cylinders. Find the potential between 
the cylinders Cy: lz] = 1 (potential U, = 0) and 
Co: |z — cl =c (potential U, = 220V), where 
0<c< 3. Sketch or graph equipotential lines and 
their orthogonal trajectories for c = z. Can you guess 
how the graph changes if you increase c (< 4)? 


11. 
12. 


13. 


14. 


15. 


16. 
17. 


On Example 2. Verify the calculations. 


Show that in Example 2 the y-axis is mapped onto the 
unit circle in the w-plane. 


Atz = +1 in Fig. 405 the tangents to the equipotential 
lines as shown make equal angles. Why? 


Figure 405 gives the impression that the potential on 
the y-axis changes more rapidly near 0 than near +i. 
Can you verify this? 

Angular region. By applying a suitable conformal 
mapping, obtain from Fig. 406 the potential ® in the 
sector —{7 < Argz< 47a such that ® = —3kV if 
Arg z = —g7 and ® = 3kV if Arg z = 477. 

Solve Prob. 15 if the sector i -ha <Argz< An. 


Another extension of Example 2. Find the linear 
fractional transformation z = g(Z) that maps |Z| = 1 
onto |z| S 1 with Z = i/2 being mapped onto z = 0. 
Show that Z; = 0.6 + 0.87 is mapped onto z = —1 
and Zp = —0.6+ 0.8i onto z= 1, so that the 
equipotential lines of Example 2 look in |Z| S 1 as 
shown in Fig. 407. 


Problem 17 


Fig. 407. 


SEC. 18.3. Heat Problems 767 


18. The equipotential lines in Prob. 17 are circles. Why? 20. Jumps. Do the same task as in Prob. 19 if the boundary 
19. Jump on the boundary. Find the complex and real values on the x-axis are Vo when —a < x <a and 0 
potentials in the upper half-plane with boundary values elsewhere. 
5 kV if x < 2 and 0 if x > 2 on the x-axis. 


18.3 Heat Problems 


EXAMPLE 1 


EXAMPLE 2 


Heat conduction in a body of homogeneous material is modeled by the heat equation 
LV T 


where the function T is temperature, 7, = 07/0t, t is time, and Cisa positive constant 
(specific to the material of the body; see Sec. 12.6). 

Now if a heat flow problem is steady, that is, independent of time, we have 7, = 0. If 
it is also two-dimensional, then the heat equation reduces to 


(1) WT = Lot Tey =O 


which is the two-dimensional Laplace equation. Thus we have shown that we can model 
a two-dimensional steady heat flow problem by Laplace’s equation. 

Furthermore we can treat this heat flow problem by methods of complex analysis, since 
T (or T(x, y)) is the real part of the complex heat potential 


Fiz) = TG. y) + eG, y). 


We call T(x, y) the heat potential. The curves T(x, y) = const are called isotherms, which 
means lines of constant temperature. The curves V(x, y) = const are called heat flow 
lines because heat flows along them from higher temperatures to lower temperatures. 

It follows that all the examples considered so far (Secs. 18.1, 18.2) can now be 
reinterpreted as problems on heat flow. The electrostatic equipotential lines B(x, y) = const 
now become isotherms T(x, y) = const, and the lines of electrical force become lines of 
heat flow, as in the following two problems. 


Temperature Between Parallel Plates 


Find the temperature between two parallel plates x = 0 and x = d in Fig. 408 having temperatures 0 and 100°C, 
respectively. 


Solution. As in Example | of Sec. 18.1 we conclude that T(x, y) = ax + b. From the boundary conditions, 
b = 0 and a = 100/d. The answer is 


T(x, y) a [°C] 
x,y) =——x 
7 d 


The corresponding complex potential is F(z) = (100/d)z. Heat flows horizontally in the negative x-direction 
along the lines y = const. =] 


Temperature Distribution Between a Wire and a Cylinder 


Find the temperature field around a long thin wire of radius r; = | mm that is electrically heated to 7, = 500°F 
and is surrounded by a circular cylinder of radius rg = 100 mm, which is kept at temperature Z = 60°F by 
cooling it with air. See Fig. 409. (The wire is at the origin of the coordinate system.) 
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Solution. T depends only on r, for reasons of symmetry. Hence, as in Sec. 18.1 (Example 2), 
T(x,y)=alnr+ b. 
The boundary conditions are 
T, = 500 = aln1 +b, TP, = 60 = aln 100 + b. 
Hence b = 500 (since In 1 = 0) and a = (60 — b)/In 100 = —95.54. The answer is 
T(x, y) = 500 — 95.54 In r[°F]. 


The isotherms are concentric circles. Heat flows from the wire radially outward to the cylinder. Sketch T as a 


function of r. Does it look physically reasonable? i} 
y 
y 
Insulated 
“Po Te 
fo) 
S 
“| Tl 
& 
Pale meee x 
eS 
© (aie — 
i T=50°C 1 x 
Sb -}--4- 0 
Fig. 408. Example 1 Fig. 409. Example 2 Fig. 410. Example 3 


Mathematically the calculations remain the same in the transition to another field of 
application. Physically, new problems may arise, with boundary conditions that would 
make no sense physically or would be of no practical interest. This is illustrated by the 
next two examples. 


A Mixed Boundary Value Problem 


Find the temperature distribution in the region in Fig. 410 (cross section of a solid quarter-cylinder), whose 
vertical portion of the boundary is at 20°C, the horizontal portion at 50°C, and the circular portion is insulated. 


Solution. The insulated portion of the boundary must be a heat flow line, since, by the insulation, heat is 
prevented from crossing such a curve, hence heat must flow along the curve. Thus the isotherms must meet 
such a curve at right angles. Since T is constant along an isotherm, this means that 


oT 
on 


(2) 


0 along an insulated portion of the boundary. 


Here dT/dn is the normal derivative of 7, that is, the directional derivative (Sec. 9.7) in the direction normal 
(perpendicular) to the insulated boundary. Such a problem in which 7 is prescribed on one portion of the boundary 
and d7/dn on the other portion is called a mixed boundary value problem. 

In our case, the normal direction to the insulated circular boundary curve is the radial direction toward the 
origin. Hence (2) becomes 07/dr = 0, meaning that along this curve the solution must not depend on r. Now 
Arg z = 6 satisfies (1), as well as this condition, and is constant (0 and 77/2) on the straight portions of the 
boundary. Hence the solution is of the form 


T(x, y) = a0 +b. 
The boundary conditions yield a - 7/2 + b = 20 anda: 0 + b = 50. This gives 


60 y. 
T(x, y) = 50 — — 6, 6 = arctan —. 
7 x 
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The isotherms are portions of rays 6 = const. Heat flows from the x-axis along circles r = const (dashed in 
Fig. 410) to the y-axis. [a] 


Tq 
5 Insulated i 


T=0°C -1 “~jnsulated 1 T= 20°C 


Fig. 411. Example 4: z-plane Fig. 412. Example 4: w-plane 


EXAMPLE 4 _ Another Mixed Boundary Value Problem in Heat Conduction 


Find the temperature field in the upper half-plane when the x-axis is kept at T = 0°C for x < —1, is insulated 
for —1 <x < 1, and is kept at T = 20°C for x > 1 (Fig. 411). 


Solution. We map the half-plane in Fig. 411 onto the vertical strip in Fig. 412, find the temperature 7* (u, v) 
there, and map it back to get the temperature T(x, y) in the half-plane. 

The idea of using that strip is suggested by Fig. 391 in Sec. 17.4 with the roles of z = x + iy andw = u + iv 
interchanged. The figure shows that z = sin w maps our present strip onto our half-plane in Fig. 411. Hence the 
inverse function 


w = f(z) = arcsin z 


maps that half-plane onto the strip in the w-plane. This is the mapping function that we need according to 
Theorem | in Sec. 18.2. 

The insulated segment —1 < x < 1 on the x-axis maps onto the segment —7/2 < u < 77/2 on the u-axis. 
The rest of the x-axis maps onto the two vertical boundary portions u = —7/2 and 7/2,v > 0, of the strip. 
This gives the transformed boundary conditions in Fig. 412 for T* (u,v), where on the insulated horizontal 
boundary, dT*/dn = dT*/dv = 0 because v is a coordinate normal to that segment. 

Similarly to Example 1 we obtain 


rm = 20 
T*(u,v) =10+—u 
7 


which satisfies all the boundary conditions. This is the real part of the complex potential F*(w) = 10 + (20/7) w. 
Hence the complex potential in the z-plane is 


20 . 
F(z) = F*(f(2)) = 10 + Fp aresin z 
and T(x, y) = Re F(z) is the solution. The isotherms are vu = const in the strip and the hyperbolas in the z-plane, 
perpendicular to which heat flows along the dashed ellipses from the 20°-portion to the cooler 0°-portion of the 


boundary, a physically very reasonable result. B 


Sections 18.3 and 18.5 show some of the usefulness of conformal mappings and complex 
potentials. Furthermore, complex potential models fluid flow in Sec. 18.4. 


PROBLEM SET 18-3 


1. Parallel plates. Find the temperature between the 2. Infinite plate. Find the temperature and the complex 
plates y = 0 and y = d kept at 20 and 100°C, respec- potential in an infinite plate with edges y = x — 4 and 
tively. (i) Proceed directly. (ii) Use Example 1 and a y =x + 4kept at —20 and 40°C, respectively (Fig. 413). 


suitable mapping. In what case will this be an approximate model? 
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Fig. 413. Problem 2: Infinite plate 


3. CAS PROJECT. Isotherms. Graph isotherms and 
lines of heat flow in Examples 2—4. Can you see from 
the graphs where the heat flow is very rapid? 


4-18] TEMPERATURE T (x, y) IN PLATES 


Find the temperature distribution T(x, y) and the complex 
potential F(z) in the given thin metal plate whose faces 
are insulated and whose edges are kept at the indicated 
temperatures or are insulated as shown. 


4. 


T = 200°C * 


10. 


11. 


12. 


16. 


17. 


18. 
19. 


20. 


T=0 7=100C 7T=0 * 


T =0°C ~ 


Hint. Apply w = cosh z to Prob. 11. 


14. , 
oO Insulated 
ie) 
co) 
N 
Hl 
& 
T=0°C a 
-20 
Insulated 


T = 20°C ss 


First quadrant of the z-plane with y-axis kept at 100°C, 
the segment 0 < x < 1 of the x-axis insulated and the 
x-axis for x > 1 kept at 200°C. Hint. Use Example 4. 


Figure 410, T(0, y) = —30°C, T(x, 0) = 100°C 


Interpretation. Formulate Prob. 11 in terms of electro- 
statics. 


Interpretation. Interpret Prob. 17 in Sec. 18.2 as a heat 
problem, with boundary temperatures, say, 10°C on the 
upper part and 200°C on the lower. 
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18.4 Fluid Flow 


Laplace’s equation also plays a basic role in hydrodynamics, in steady nonviscous fluid 
flow under physical conditions discussed later in this section. For methods of complex 
analysis to be applicable, our problems will be two-dimensional, so that the velocity vector 
V by which the motion of the fluid can be given depends only on two space variables x 
and y, and the motion is the same in all planes parallel to the xy-plane. 

Then we can use for the velocity vector V a complex function 


(1) V=\Ytilb 

giving the magnitude |V| and direction Arg V of the velocity at each point z = x + iy. 
Here V; and V4 are the components of the velocity in the x and y directions. V is tangential 
to the path of the moving particles, called a streamline of the motion (Fig. 414). 


We show that under suitable assumptions (explained in detail following the examples), 
for a given flow there exists an analytic function 


(2) F(Z) = OG, y) + iVG, y), 


called the complex potential of the flow, such that the streamlines are given by 
W(x, y) = const, and the velocity vector or, briefly, the velocity is given by 


(3) V=Ytih=F( 


Fig. 414. Velocity 


where the bar denotes the complex conjugate. V is called the stream function. The 
function ® is called the velocity potential. The curves ®(x, y) = const are called 
equipotential lines. The velocity vector V is the gradient of ©; by definition, this 
means that 


o® o® 
4 Y=—, we. 
(4) 1 2 = ay 


Indeed, for F = ® + iW, Eq. (4) in Sec. 13.4 is F’ = ©, + iV, with V, = —@,, by 
the second Cauchy—Riemann equation. Together we obtain (3): 


F'@) = ®, — iV, = ©, + id, =, + i = V. 
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Furthermore, since F(z) is analytic, ® and W satisfy Laplace’s equation 


2 2 2 2 
(5) vp =29 FP 9 yay =P OY 
ax? ay” ax? ay? 


0. 


Whereas in electrostatics the boundaries (conducting plates) are equipotential lines, in 
fluid flow the boundaries across which fluid cannot flow must be streamlines. Hence in 
fluid flow the stream function is of particular importance. 

Before discussing the conditions for the validity of the statements involving (2)—(5), let 
us consider two flows of practical interest, so that we first see what is going on from a 
practical point of view. Further flows are included in the problem set. 


Flow Around a Corner 
The complex potential F(z) = a y? + 2ixy models a flow with 


Equipotential lines @& = x2 = y2 = const (Hyperbolas) 


Streamlines W = 2xy = const (Hyperbolas). 


From (3) we obtain the velocity vector 


V=2z=2 — iy), that is, VY, = 2x, We = —2y. 


The speed (magnitude of the velocity) is 


lV| Vvi + v2 IV x2 4 y?. 


The flow may be interpreted as the flow in a channel bounded by the positive coordinates axes and a hyperbola, 
say, xy = | (Fig. 415). We note that the speed along a streamline S has a minimum at the point P where the 
cross section of the channel is large. 1] 


(0) x 


Fig. 415. Flow around a corner (Example 1) 


Flow Around a Cylinder 


Consider the complex potential 


1 
F(z) = ®(, y) + i¥@, y) = z4 - 


Using the polar form z = re”, we obtain 


F ls 1 1 
F(z) = re’? 4 se (- + +) cos + i(r *) sin 


Hence the streamlines are 


1 
V(x, y) = (- = 1) sin 8 = const. 
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In particular, V(x, y) = 0 gives r — 1/r = 0 or sin @ = 0. Hence this streamline consists of the unit circle 
(r = 1/r gives r = 1) and the x-axis (9 = 0 and @ = 77). For large |z| the term 1/z in F(z) is small in absolute 
value, so that for these z the flow is nearly uniform and parallel to the x-axis. Hence we can interpret this as a 
flow around a long circular cylinder of unit radius that is perpendicular to the z-plane and intersects it in the 
unit circle |z| = 1 and whose axis corresponds to z = 0. 

The flow has two stagnation points (that is, points at which the velocity V is zero), at z = +1. This follows 
from (3) and 


1 
F@=1-—>, hence 2-1=0. (See Fig. 416.) Mf 


y 


Fig. 416. Flow around a cylinder (Example 2) 


Assumptions and Theory Underlying (2)—(5) 


Complex Potential of a Flow 


If the domain of flow is simply connected and the flow is irrotational and 
incompressible, then the statements involving (2)-(5) hold. In particular, then the 
flow has a complex potential F(z), which is an analytic function. (Explanation of 
terms below.) 


We prove this theorem, along with a discussion of basic concepts related to fluid flow. 


(a) First Assumption: Irrotational. Let C be any smooth curve in the z-plane given 
by z(s) = x(s) + iy(s), where s is the arc length of C. Let the real variable VY. be the 
component of the velocity V tangent to C (Fig. 417). Then the value of the real line 
integral 


(6) | Vv; ds 


Cc 


Fig. 417. Tangential component of the 
velocity with respect to a curve C 
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taken along C in the sense of increasing s is called the circulation of the fluid along C, 
a name that will be motivated as we proceed in this proof. Dividing the circulation by the 
length of C, we obtain the mean velocity’ of the flow along the curve C. Now 


VY, = |V| cos a (Fig. 417). 


Hence ¥, is the dot product (Sec. 9.2) of V and the tangent vector dz/ds of C (Sec. 17.1); 
thus in (6), 


dx dy 
Yds =| V, + Vi 
a (us 2 ds 


)as= VY, dx + VW dy. 
The circulation (6) along C now becomes 


(7) | Yds = [ow dx + Vo dy). 
Cc Cc 


As the next idea, let C be a closed curve satisfying the assumption as in Green’s theorem 
(Sec. 10.4), and let C be the boundary of a simply connected domain D. Suppose further 
that V has continuous partial derivatives in a domain containing D and C. Then we can 
use Green’s theorem to represent the circulation around C by a double integral, 


vy OY 
(8) bia + wan = |{( )deay 
D 


e Ox oy 


The integrand of this double integral is called the vorticity of the flow. The vorticity 
divided by 2 is called the rotation 


1 (0VW OV, 
(9) w(x, 9) = 5 (= = =) 


We assume the flow to be irrotational, that is, w(x, y) = 0 throughout the flow; thus, 


Ve AV, 
(10) — == =i!) 
Ox oy 


To understand the physical meaning of vorticity and rotation, take for C in (8) a circle. 
Let r be the radius of C. Then the circulation divided by the length 27rr of C is the mean 


b 
1 
Definitions: ; | f(x) dx = mean value of f on the interval a = x Sb, 
>a 
a 


1 
z= | f(s) ds = mean value of fon C (L = length of C), 
c 


1 
A | | F(x, y) dx dy = mean value of f on D (A = area of D). 
D 


SEC. 18.4 Fluid Flow 775 


velocity of the fluid along C. Hence by dividing this by r we obtain the mean angular 
velocity wo of the fluid about the center of the circle: 


1 ave 4) l jI 

= a | 

00 Or? | | ( ax ay x dy eee w(x, y) dx dy 
p D 


If we now let r—0, the limit of wo is the value of w at the center of C. Hence w(x, y) is 
the limiting angular velocity of a circular element of the fluid as the circle shrinks to the 
point (x, y). Roughly speaking, if.a spherical element of the fluid were suddenly solidified 
and the surrounding fluid simultaneously annihilated, the element would rotate with the 
angular velocity w. 

(b) Second Assumption: Incompressible. Our second assumption is that the fluid is 
incompressible. (Fluids include liquids, which are incompressible, and gases, such as air, 
which are compressible.) Then 


Chen) 
(11) Sto 
Ox oy 


in every region that is free of sources or sinks, that is, points at which fluid is produced 
or disappears, respectively. The expression in (11) is called the divergence of V and is 
denoted by div V. (See also (7) in Sec. 9.8.) 

(c) Complex Velocity Potential. If the domain D of the flow is simply connected 
(Sec. 14.2) and the flow is irrotational, then (10) implies that the line integral (7) is 
independent of path in D (by Theorem 3 in Sec. 10.2, where Fy = VW, Fo = Vo, F3 = 0, 
and z is the third coordinate in space and has nothing to do with our present z). Hence if 
we integrate from a fixed point (a, b) in D to a variable point (x, y) in D, the integral 
becomes a function of the point (x, y), say, B(x, y): 


(x,y) 
(12) D(x, y) = | (Yi dx + Vody). 
(a,b) 


We claim that the flow has a velocity potential ®, which is given by (12). To prove 
this, all we have to do is to show that (4) holds. Now since the integral (7) is 
independent of path, 4, dx + Vo dy is exact (Sec. 10.2), namely, the differential of ®, 
that is, 


a® a® 
Yi dx + Vody = — dx + —~dy. 
Ox oy 


From this we see that V, = 0®/ax and % = d®/dy, which gives (4). 

That ® is harmonic follows at once by substituting (4) into (11), which gives the first 
Laplace equation in (5). 

We finally take a harmonic conjugate WY of ®. Then the other equation in (5) holds. 
Also, since the second partial derivatives of ® and V are continuous, we see that the 
complex function 


Py = Oy, y) + IVR) 
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is analytic in D. Since the curves V(x, y) = const are perpendicular to the equipotential 
curves P(x, y) = const (except where F ‘(z) = 0), we conclude that V(x, y) = const are 
the streamlines. Hence W is the stream function and F(z) is the complex potential of the 
flow. This completes the proof of Theorem | as well as our discussion of the important 
role of complex analysis in compressible fluid flow. a 


PROBLEM SET 18.4 


10. 


11. 


12. 


13. 


14. 


15. 


16. 


17. 


. Differentiability. Under what condition on the velocity 


vector V in (1) will F(z) in (2) be analytic? 


. Corner flow. Along what curves will the speed in 


Example | be constant? Is this obvious from Fig. 415? 


. Cylinder. Guess from physics and from Fig. 416 where 


on the y-axis the speed is maximum. Then calculate. 


. Cylinder. Calculate the speed along the cylinder wall 


in Fig. 416, also confirming the answer to Prob. 3. 


. Irrotational flow. Show that the flow in Example 2 is 


irrotational. 


. Extension of Example 1. Sketch or graph and 


interpret the flow in Example 1 on the whole upper 
half-plane. 


. Parallel flow. Sketch and interpret the flow with 


complex potential F(z) = z. 


. Parallel flow. What is the complex potential of an 


upward parallel flow of speed K > 0 in the direction 
of y = x? Sketch the flow. 


. Corner. What F(z) would be suitable in Example 1 if 


the angle of the corner were 7r/4 instead of 7/2? 
Corner. Show that F(z) = iz? also models a flow 
around a corner. Sketch streamlines and equipotential 
lines. Find V. 

What flow do you obtain from F(z) = —iKz, K positive 
real? 

Conformal mapping. Obtain the flow in Example 1 
from that in Prob. 11 by a suitable conformal mapping. 
60°- Sector. What F(z) would be suitable in Example 1 
if the angle at the corner were 77/3? 

Sketch or graph streamlines and equipotential lines 
of F(z) = iz®. Find V. Find all points at which V is 
horizontal. 

Change F(z) in Example 2 slightly to obtain a flow 
around a cylinder of radius ro that gives the flow in 
Example 2 if r9 > 1. 

Cylinder. What happens in Example 2 if you replace 
z by 27? Sketch and interpret the resulting flow in the 
first quadrant. 

Elliptic cylinder. Show that F(z) = arccos z gives 
confocal ellipses as streamlines, with foci at z = +1, 


18. 


19. 


20. 


and that the flow circulates around an elliptic cylinder 
or a plate (the segment from —1 to | in Fig. 418). 


Fig. 418. Flow around a plate in Prob. 17. 
Aperture. Show that F(z) = arccosh z gives confocal 
hyperbolas as streamlines, with foci at z = £1, and the 
flow may be interpreted as a flow through an aperture 
(Fig. 419). 


Fig. 419. Flow through an aperture in Prob. 18. 
Potential F(z) = 1/z. Show that the streamlines of 
F(z) = 1/z and circles through the origin with centers 
on the y-axis. 


TEAM PROJECT. Role of the Natural Logarithm 
in Modeling Flows. (a) Basic flows: Source and sink. 
Show that F(z) = (c/27r) In z with constant positive 
real c gives a flow directed radially outward (Fig. 420), 
so that F models a point source at z = 0 (that is, a 
source line x = 0, y = 0 in space) at which fluid is 
produced. c is called the strength or discharge of the 
source. If c is negative real, show that the flow is 
directed radially inward, so that F models a sink at 
z= 0, a point at which fluid disappears. Note that 
z = 0 is the singular point of F(z). 
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Fig. 420. Point source 


(b) Basic flows: Vortex. Show that F(z) = —(Ki/277) 
In z with positive real K gives a flow circulating coun- 
terclockwise around z = 0 (Fig. 421). z = 0 is called a 
vortex. Note that each time we travel around the vortex, 


the potential increases by K. 


(c) Addition of flows. Show that addition of the 
velocity vectors of two flows gives a flow whose 
complex potential is obtained by adding the complex 


potentials of those flows. 


Fig. 421. Vortex flow 


(d) Source and sink combined. Find the complex 
potentials of a flow with a source of strength 1 atz = —a 
and of a flow with a sink of strength 1 at z = a. Add 
both and sketch or graph the streamlines. Show that for 
small |a| these lines look similar to those in Prob. 19. 

(e) Flow with circulation around a cylinder. Add 
the potential in (b) to that in Example 2. Show that this 
gives a flow for which the cylinder wall |z| = 1 is a 
streamline. Find the speed and show that the stagnation 


points are 
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if K = 0 they are at +1; as K increases they move up 
on the unit circle until they unite at z = i (K = 477, see 
Fig. 422), and if K > 477 they lie on the imaginary axis 
(one lies in the field of flow and the other one lies inside 
the cylinder and has no physical meaning). 


Fig. 422. Flow around a cylinder without circulation 
(K = 0) and with circulation 


18.5 Poisson's Integral Formula for Potentials 


So far in this chapter we have seen powerful methods based on conformal mappings and 
complex potentials. They were used for modeling and solving two-dimensional potential 
problems and demonstrated the importance of complex analysis. 

Now we introduce a further method that results from complex integration. It will yield 
the very important Poisson integral formula (5) for potentials in a standard domain 
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(a circular disk). In addition, from (5), we will derive a useful series (7) for these potentials. 
This allows us to solve problems for disks and then map solutions conformally onto other 
domains. 


Derivation of Poisson’s Integral Formula 


Poisson’s formula will follow from Cauchy’s integral formula (Sec. 14.3) 


1 F(z* 
(1) F(Z) == ; FE) pe 
277i GHz 
C 
Here Cis the circle z* = Re’ (counterclockwise, 0 = a = 277), and we assume that F(z*) 
is analytic in a domain containing C and its full interior. Since dz* = iRe’“ da = iz* da, 
we obtain from (1) 


20 


1 
(2) F(z) = | F(z*) 


da (z* = Re, z = re"), 


ZZ 


So 


Now comes a little trick. If instead of z inside C we take a Z outside C, the integrals (1) 
and (2) are zero by Cauchy’s integral theorem (Sec. 14.2). We choose Z = z*z*/z = Ry re 
which is outside C because |Z| = R?/|z| = Rip > R. From (2) we thus have 


Qa 20 
1 z* 1 Ze 
O=—] Ret) ——da=—] Fe) —*— 
ral Crag sal on page 

ge 


Zz 
and by straightforward simplification of the last expression on the right, 


i ie zZ 
0 = —— | Fe") =—- = da; 
27 ' ca ci 


We subtract this from (2) and use the following formula that you can verify by direct 
calculation (zz* cancels): 


ze Z iy Aer a 


(3) 


ze— Zz Z—- Ze (g* — ZZ —-Z 


We then have 


27 = = 
a. a zee — ZZ 
(4) F(z) = al B(z*) (z* — z(Z* — 2) o 


From the polar representations of z and z* we see that the quotient in the integrand is real 
and equal to 


R2— 72 R2 — 2 


(Re’* — re’)(Re~ — re) R2 — 2Rrcos (9 — a) + r2 
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We now write F(z) = B(r, 0) + iV (7, 6) and take the real part on both sides of (4). Then 
we obtain Poisson’s integral formula” 


R2 — p2 
x da 
— 2Rrcos(9-— a) +r 


2a 
1 
(5) OGo)— =| D(R, a) re 
0 


This formula represents the harmonic function ® in the disk |z| S R in terms of its values 
@(R, a) on the boundary (the circle) lz] = R. 

Formula (5) is still valid if the boundary function ®(R,a@) is merely piecewise 
continuous (as is practically often the case; see Figs. 405 and 406 in Sec. 18.2 for an 
example). Then (5) gives a function harmonic in the open disk, and on the circle |z] = R 
equal to the given boundary function, except at points where the latter is discontinuous. 
A proof can be found in Ref. [D1] in App. 1. 


Series for Potentials in Disks 


From (5) we may obtain an important series development of ® in terms of simple harmonic 
functions. We remember that the quotient in the integrand of (5) was derived from (3). 
We claim that the right side of (3) is the real part of 


ghar ge. (CE Zee = Zz) Zh ez = Zz ee 
ze— Zz (Z* — zV(z* — Zz) [z* - 2|? 


Indeed, the last denominator is real and so is z*z* — zz in the numerator, whereas 
—z*z + zz* = 27 Im (zz*) in the numerator is pure imaginary. This verifies our claim. 
Now by the use of the geometric series we obtain (develop the denominator) 


ate Lt _/ <) > (<)'- (2) 
e e=2 1H) ae 2 ae 142d oe 


0 


Since z = re’ and z* = Re’, we have 


Z\" Ftp ai ry 
aM ina = = 
Re (3) | = Re Lr eve | = (<) cos (nO — na). 


On the right, cos (n@ — na) = cos nO cos na + sin né sinna. Hence from (6) we obtain 


Zeb Zz 


Ze % 


oo Zz n 
Re =1+2)> re(=) 
n=1 


(6*) 
} n 
=1+2 > (<) (cos n@ cos na + sin né@ sin na). 
n=1 


2SIMEON DENIS POISSON (1781-1840), French mathematician and physicist, professor in Paris from 1809. 
His work includes potential theory, partial differential equations (Poisson equation, Sec. 12.1), and probability 
(Sec. 24.7). 
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This expression is equal to the quotient in (5), as we have mentioned before, and 
by inserting it into (5) and integrating term by term with respect to a from 0 to 277 
we obtain 


i} n 
(7) (7,0) =a +> (<) (an, cos nO + by, sin nO) 


n=1 


where the coefficients are [the 2 in (6*) cancels the 2 in 1/(277) in (5)] 


1 27 I 27 
dg = == | D(R, a) da, a, = — | ®(R, a) cos na da, 
0 ir 0 
(8) oe 


27 
by = ral @(R, a) sin na da, 
0 


the Fourier coefficients of ®(R, a); see Sec. 11.1. Now, for r = R, the series (7) becomes 
the Fourier series of ®(R, a). Hence the representation (7) will be valid whenever the 
given @(R, a) on the boundary can be represented by a Fourier series. 


Dirichlet Problem for the Unit Disk 


Find the electrostatic potential @(r, 6) in the unit disk r < 1 having the boundary values 


—a/7T if -7T<a<0 
M1, a) = { (Fig. 423). 
a/™T if 0<a<T7 


Solution. Since ®(1, a) is even, b, = 0, and from (8) we obtain ag = 3 and 
0 7 
a a 2 
m==|-| © cosnada + | cosa de) = ay (C08 nar ~ 1. 
i now 


Hence, dy, = —4/(n? 1?) if n is odd, a, = 0 if n = 2,4,---, and the potential is 


Cen, ane ree bere eae e 
r,0) =———;|rcosé + cos + cos Brel s 
2 7 3? o 

Figure 424 shows the unit disk and some of the equipotential lines (curves ® = const). @ 

@(1, a) 
1 
| | 
—™ 0) nr oO 


Fig. 423. Boundary values in Example 1 Fig. 424. Potential in Example 1 
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PROBLEM SET 18-5 


1. 


Give the details of the derivation of the series (7) from 
the Poisson formula (5). 


. Verify (3). 


. Show that each term of (7) is a harmonic function in 
the disk r < R. 


. Why does the series in Example 1 reduce to a cosine 
series? 


5-18 


HARMONIC FUNCTIONS IN A DISK 


Using (7), find the potential ®(r, 6) in the unit disk r < 1 
having the given boundary values @(1, 0). Using the sum 


of 
of 


the first few terms of the series, compute some values 
® and sketch a figure of the equipotential lines. 


. O(1, 0) = 3 sin 30 


. P(1,0) = 5 — cos 20 

. ®(1, 0) = acos” 46 

. BL, 0) = 4 sin? 6 

. (1, 0) = 8 sin* @ 

. (1, 0) = 16 cos? 20 

. 01,0 =6/7 if-t<0<7 

- 01,0)=k if0 <6 <7 and 0 otherwise 

. 1,0) =6 if -da <6 < 47 and 0 otherwise 


. 01,0) = |6|/m7 if-T7<0<7 
. (1,0) =1 if da < 0 <4z7 and 0 otherwise 


16. 


17. 


18. 


19, 


20. 


647 if -7r<0<0 
wa.) ={ 

0-7 if 0<é0<7T7 
O(1,0) = 67/n if -t7<0<7 

0 if -7<0<0 
1.0) ={ 

Q if O0<0<7 


CAS EXPERIMENT. Series (7). Write a program for 
series developments (7). Experiment on accuracy by 
computing values from partial sums and comparing them 
with values that you obtain from your CAS graph. Do 
this (a) for Example | and Fig. 424, (b) for ® in Prob. 11 
(which is discontinuous on the boundary!), (c) for a ® 
of your choice with continuous boundary values, and 
(d) for ® with discontinuous boundary values. 


TEAM PROJECT. Potential in a Disk. (a) Mean 
value property. Show that the value of a harmonic 
function ® at the center of a circle C equals the mean 
of the value of ® on C (see Sec. 18.4, footnote 1, for 
definitions of mean values). 

(b) Separation of variables. Show that the terms of 
(7) appear as solutions in separating the Laplace 
equation in polar coordinates. 

(c) Harmonic conjugate. Find a series for a harmonic 
conjugate VW of ® from (7). Hint. Use the Cauchy— 
Riemann equations. 


(d) Power series. Find a series for F(z) = ® + iV. 


18.6 General Properties of Harmonic Functions. 
Uniqueness Theorem for the Dirichlet Problem 


Recall from Sec. 10.8 that harmonic functions are solutions to Laplace’s equation and 
their second-order partial derivatives are continuous. In this section we explore how 
general properties of harmonic functions often can be obtained from properties of analytic 
functions. This can frequently be done in a simple fashion. Specifically, important mean 
value properties of harmonic functions follow readily from those of analytic functions. 


The details are as follows. 


THEOREM -1 


Mean Value Property of Analytic Functions 


Let F(z) be analytic in a simply connected domain D. Then the value of F(z) at a point 
Zo in D is equal to the mean value of F(z) on any circle in D with center at Zo. 
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In Cauchy’s integral formula (Sec. 14.3) 


ZL 


1 F 
(1) Fo) = 375 ; Ox 
Cc 


we choose for C the circle z = zo + re“ in D. Then z — Zo = re’, dz = ire’ da, and 
(1) becomes 


27 


(2) F(zo) = = | F(zo + re’) da. 
0 


The right side is the mean value of F on the circle (= value of the integral divided by the 
length 277 of the interval of integration). This proves the theorem. ey 


For harmonic functions, Theorem | implies 


Two Mean Value Properties of Harmonic Functions 


Let ®(x, y) be harmonic in a simply connected domain D. Then the value of P(x, y) 
at a point (XQ, Yo) in D is equal to the mean value of B(x, y) on any circle in D with 
center at (Xo, Yo). This value is also equal to the mean value of ®(x, y) on any 
circular disk in D with center (Xo, yo). [See footnote | in Sec. 18.4.] 


The first part of the theorem follows from (2) by taking the real parts on both sides, 


2a 
1 
D(xo, yo) = Re F(xo + iyo) = | D(xo + rcosa, yo + rsin a) da. 
0 


The second part of the theorem follows by integrating this formula over r from 0 to 7o (the 
radius of the disk) and dividing by r2/ 2s, 


To p27 

1 

(3) D(x, yo) = = | | D(xo + rcos a, yo + rsin a)r da dr. 
TO “9 Yo 


The right side is the indicated mean value (integral divided by the area of the region of 
integration). ia 


Returning to analytic functions, we state and prove another famous consequence of Cauchy’s 
integral formula. The proof is indirect and shows quite a nice idea of applying the ML- 
inequality. (A bounded region is a region that lies entirely in some circle about the origin.) 


Maximum Modulus Theorem for Analytic Functions 


Let F(z) be analytic and nonconstant in a domain containing a bounded region R 
and its boundary. Then the absolute value |F(z)| cannot have a maximum at an 
interior point of R. Consequently, the maximum of |F(z)| is taken on the boundary 
of R. If F(2) # 0 in R, the same is true with respect to the minimum of |F(z)|. 
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We assume that |F(z)| has a maximum at an interior point zo of R and show that this 
leads to a contradiction. Let |F(zo)| = M be this maximum. Since F(z) is not constant, 
| F(z)| is not constant, as follows from Example 3 in Sec. 13.4. Consequently, we can find 
a circle C of radius r with center at zo such that the interior of C is in R and |F(z)| is 
smaller than M at some point P of C. Since |F(z)| is continuous, it will be smaller than 
M on an arc C, of C that contains P (see Fig. 425), say, 


lIF(2a| SM—-—k (k>0) for all z on Cy. 
Let Cy have the length L;. Then the complementary arc C2 of C has the length 27rr — Ly. 


We now apply the ML-inequality (Sec. 14.1) to (1) and note that |z — zo| = r. We then 
obtain (using straightforward calculation in the second line of the formula) 


1 F(z) 1 F(z) 
M = |FGo)l Sa | zn a| + om | a 
Cy Cy 
1/(M-k 1 (M _ kL, 
= a4( z Jat ae(*)em Li) =M Darr <M 


that is, M < M, which is impossible. Hence our assumption is false and the first statement 
is proved. 

Next we prove the second statement. If F(z) # 0 in R, then 1/F(z) is analytic in R. 
From the statement already proved it follows that the maximum of 1/|F(z)| lies on the 
boundary of R. But this maximum corresponds to the minimum of | F(z)|. This completes 
the proof. a 


Fig. 425. Proof of Theorem 3 


This theorem has several fundamental consequences for harmonic functions, as follows. 


Harmonic Functions 
Let ®(x, y) be harmonic in a domain containing a simply connected bounded region 
R and its boundary curve C. Then: 

(1) (Maximum principle) /f ®(x, y) is not constant, it has neither a maximum 
nor a minimum in R. Consequently, the maximum and the minimum are taken on 
the boundary of R. 

(II) If ®(, y) is constant on C, then B(x, y) is a constant. 

(III) If h(x, y) is harmonic in R and on C and if h(x, y) = B(x, y) on C, then 
h(x, y) = B(x, y) everywhere in R. 
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(I) Let W(x, y) be a conjugate harmonic function of ®(x, y) in R. Then the complex 
function F(z) = ®(x, y) + iW(x, y) is analytic in R, and so is G(z) = e*® Its absolute 
value is 


IG(2)| = eke F@) — eh, y 


From Theorem 3 it follows that |G(z)| cannot have a maximum at an interior point of R. 
Since e® is a monotone increasing function of the real variable ®, the statement about 
the maximum of ® follows. From this, the statement about the minimum follows by 
replacing ® by —®. 

(II) By (J the function ®(x, y) takes its maximum and its minimum on C. Thus, if 
@(x, y) is constant on C, its minimum must equal its maximum, so that P(x, y) must be 
a constant. 

(III) If h and ® are harmonic in R and on C, then h — ® is also harmonic in R and 
on C, and by assumption,  — ® = 0 everywhere on C. By (II) we thus have h — ® = 0 
everywhere in R, and (III) is proved. fe 


The last statement of Theorem 4 is very important. It means that a harmonic function is 
uniquely determined in R by its values on the boundary of R. Usually, B(x, y) is required 
to be harmonic in R and continuous on the boundary of R, that is, 


lim ®(x, y) = B(xo, yo), where (xg, yo) is on the boundary and (x, y) is in R. 
L—>2X9 


¥y~Yo 


Under these assumptions the maximum principle (I) is still applicable. The problem of 
determining ®(x, y) when the boundary values are given is called the Dirichlet problem 
for the Laplace equation in two variables, as we know. From (III) we thus have, as a 
highlight of our discussion, 


Uniqueness Theorem for the Dirichlet Problem 


If for a given region and given boundary values the Dirichlet problem for the Laplace 
equation in two variables has a solution, the solution is unique. 


PROBLEM SET 18-6 


PROBLEMS RELATED TO THEOREMS 1 AND 2 7-9| Verify (3) in Theorem 2 for the given ®(,, y), 


circle of radius 1. 


Lo(zt+ 1)°: Zo = 


(xo, Yo), and circle of radius 1. 
7. («— I(y- 1), (2, —2) 
8. x2 — y?, (3,8) 

9 xtytxy, J) 


1-4| Verify Theorem | for the given F(z), zo, and 


4 
2. 22", oe “2 10. Verify the calculations involving the inequalities in the 
3. 3z — aa zo=4 proof of Theorem 3. 
4.(-)", z= 71 11. CAS EXPERIMENT. Graphing Potentials. Graph 
5. Integrate |z| around the unit circle. Does the result the potentials in Probs. 7 and 9 and for two other 
contradict Theorem 1? functions of your choice as surfaces over a rectangle 
6. Derive the first statement in Theorem 2 from Poisson’s in the xy-plane. Find the locations of the maxima and 


integral formula. 


minima by inspecting these graphs. 


Chapter 18 Review Questions and Problems 


12. 


TEAM PROJECT. Maximum Modulus of Analytic 
Functions. (a) Verify Theorem 3 for (i) F(z) = 2 and 
the rectangle 1 Sx 35,2 Sy =4, (ii) F() = sinz 
and the unit disk, and (iii) F(z) = e* and any bounded 
domain. 

(b) F(z) = 1 + |z| is not zero in the disk |z| S 2 and 
has a minimum at an interior point. Does this contradict 
Theorem 3? 

(c) F(x) = sinx (x real) has a maximum | at 77/2. 
Why can this not be a maximum of |F(z)| = |sin z| in 
a domain containing z = 77/2? 

(d) If F(z) is analytic and not constant in the closed 
unit disk D: |z| = 1 and |F(z)| = c = const on the unit 
circle, show that F(z) must have a zero in D. 


13-17 


MAXIMUM MODULUS 


Find the location and size of the maximum of |F(z)| in the 
unit disk |z| = 1. 


13. 


F(z) = cos z 


14. 
15. 
16. 
17. 
18. 


19. 


20. 


785 


F(z) = exp 2 

F(z) = sinh 2z 

F(z) = az + b (a, b complex, a # 0) 
F(z) = 222-2 


Verify the maximum principle for P(x, y) = e” sin y 
and the rectanglea Sx =b,0 Sy 327. 


Harmonic conjugate. Do ® and a harmonic conjugate 
W ina region R have their maximum at the same point 
of R? 


Conformal mapping. Find the location (w1, v1) of the 
maximum of ®* = e“cosv in R*:|w| S1,v 20, 
where w = u + iv. Find the region R that is mapped 
onto R* by w = f(z) = 22. Find the potential in R 
resulting from ®* and the location (x1, y,) of the 
maximum. Is (uw , Vz) the image of (x1, y;)? If so, is 
this just by chance? 


CHAPTER T8 REVIEW QUESTIONS AND PROBLEMS 


1. 


12. 


13. 


Why can potential problems be modeled and solved by 
methods of complex analysis? For what dimensions? 


. What parts of complex analysis are mainly of interest 


to the engineer and physicist? 


. What is a harmonic function? A harmonic conjugate? 


. What areas of physics did we consider? Could you 


think of others? 


. Give some examples of potential problems considered 


in this chapter. Make a list of corresponding functions. 


. What does the complex potential give physically? 


. Write a short essay on the various assumptions made 


in fluid flow in this chapter. 


. Explain the use of conformal mapping in potential 


theory. 


. State the maximum modulus theorem and mean value 


theorems for harmonic functions. 


. State Poisson’s integral formula. Derive it from Cauchy’s 


formula. 


. Find the potential and the complex potential between 


the plates y = x and y = x + 10 kept at 10 V and 110 V, 
respectively. 

Find the potential and complex potential between the 
coaxial cylinders of axis 0 (hence the vertical axis 
in space) and radii ry = 1 cm, rg = 10cm, kept at 
potential U, = 200 V and U; = 2 kV, respectively. 
Do the task in Prob. 12 if U; = 220 V and the outer 
cylinder is grounded, Uz = 0. 


14. 


15. 


16. 
17. 


18. 


19. 


20. 


21. 


22. 
23. 


24. 


25. 


If plates at x; = 1 and x2 = 10 are kept at potentials 
U, = 200 V, Up = 2 kV, is the potential at x = 5 
larger or smaller than the potential at r = 5 in Prob. 12? 
No calculation. Give reason. 
Make a list of important potential functions, with 
applications, from memory. 


Find the equipotential lines of F(z) = i Ln z. 


Find the potential in the first quadrant of the xy-plane if 
the x-axis has potential 2 kV and the y-axis is grounded. 
Find the potential in the angular region between the 
plates Arg z = 77/6 kept at 800 V and Arg z = 77/3 
kept at 600 V. 


Find the temperature T in the upper half-plane if, on 
the x-axis, T = 30°C for x > 1 and —30°C for x < 1. 


Interpret Prob. 18 as an electrostatic problem. What are 
the lines of electric force? 


Find the streamlines and the velocity for the complex 
potential F(z) = (1 + i)z. Describe the flow. 


Describe the streamlines for F(z) = 377 cae a 


Show that the isotherms of F(z) = ~iz? +z are 
hyperbolas. 
State the theorem on the behavior of harmonic 


functions under conformal mapping. Verify it for 
®* = e“sinv and w =u + iv = 22. 


Find V in Prob. 22 and verify that it gives vectors 
tangent to the streamlines. 
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SUMMARY-OF-CHAPTER-LO 


Complex Analysis and Potential Theory 


Potential theory is the theory of solutions of Laplace’s equation 
(1) Vo = 0. 


Solutions whose second partial derivatives are continuous are called harmonic 
functions. Equation (1) is the most important PDE in physics, where it is of interest 
in two and three dimensions. It appears in electrostatics (Sec. 18.1), steady-state 
heat problems (Sec. 18.3), fluid flow (Sec. 18.4), gravity, etc. Whereas the three- 
dimensional case requires other methods (see Chap. 12), two-dimensional potential 
theory can be handled by complex analysis, since the real and imaginary parts of 
an analytic function are harmonic (Sec. 13.4). They remain harmonic under 
conformal mapping (Sec. 18.2), so that conformal mapping becomes a powerful 
tool in solving boundary value problems for (1), as is illustrated in this chapter. 
With a real potential ® in (1) we can associate a complex potential 


(2) F(Z =O +iv (Sec. 18.1). 


Then both families of curves ® = const and VW = const have a physical meaning. 
In electrostatics, they are equipotential lines and lines of electrical force (Sec. 18.1). 
In heat problems, they are isotherms (curves of constant temperature) and lines of 
heat flow (Sec. 18.3). In fluid flow, they are equipotential lines of the velocity 
potential and streamlines (Sec. 18.4). 

For the disk, the solution of the Dirichlet problem is given by the Poisson formula 
(Sec. 18.5) or by a series that on the boundary circle becomes the Fourier series of 
the given boundary values (Sec. 18.5). 

Harmonic functions, like analytic functions, have a number of general properties; 
particularly important are the mean value property and the maximum modulus 
property (Sec. 18.6), which implies the uniqueness of the solution of the Dirichlet 
problem (Theorem 5 in Sec. 18.6). 


PART E 


~ Numeric 
- Analysis 


Software (p. 788-789) 
CHAPTER 19 
CHAPTER 20 
CHAPTER 21 


Numerics in General 
Numeric Linear Algebra 
Numerics for ODEs and PDEs 


Numeric analysis or briefly numerics continues to be one of the fastest growing areas 
of engineering mathematics. This is a natural trend with the ever greater availability of 
computing power and global Internet use. Indeed, good software implementation of 
numerical methods are readily available. Take a look at the updated list of Software 
starting on p. 788. It contains software for purchase (commercial software) and software 
for free download (public-domain software). For convenience, we provide Internet 
addresses and phone numbers. The software list includes computer algebra systems 
(CASs), such as Maple and Mathematica, along with the Maple Computer Guide, 10th 
ed., and Mathematica Computer Guide, 10th ed., by E. Kreyszig and E. J. Norminton 
related to this text that teach you stepwise how to use these computer algebra systems and 
with complete engineering examples drawn from the text. Furthermore, there is scientific 
software, such as IMSL, LAPACK (free download), and scientific calculators with graphic 
capabilities such as TI-Nspire. Note that, although we have listed frequently used quality 
software, this list is by no means complete. 


In your career as an engineer, appplied mathematician, or scientist you are likely to use 
commercially available software or proprietary software, owned by the company you work 
for, that uses numeric methods to solve engineering problems, such as modeling chemical or 
biological processes, planning ecologically sound heating systems, or computing trajectories 
of spacecraft or satellites. For example, one of the collaborators of this book (Herbert Kreyszig) 
used proprietary software to determine the value of bonds, which amounted to solving higher 
degree polynomial equations, using numeric methods discussed in Sec. 19.2. 
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However, the availability of quality software does not alleviate your effort and 
responsibility to first understand these numerical methods. Your effort will pay off 
because, with your mathematical expertise in numerics, you will be able to plan your 
solution approach, judiciously select and use the appropriate software, judge the quality 
of software, and, perhaps, even write your own numerics software. 


Numerics extends your ability to solve problems that are either difficult or impossible 
to solve analytically. For example, certain integrals such as error function [see App. 3, 
formula (35)] or large eigenvalue problems that generate high-degree characteristic 
polynomials cannot be solved analytically. Numerics is also used to construct approximating 
polynomials through data points that were obtained from some experiments. 


Part E is designed to give you a solid background in numerics. We present many numeric 
methods as algorithms, which give these methods in detailed steps suitable for software 
implementation on your computer, CAS, or programmable calculator. The first chapter, 
Chap. 19, covers three main areas. These are general numerics (floating point, rounding errors, 
etc.), solving equations of the form f(x) = O (using Newton’s method and other methods), 
interpolation along with methods of numeric integration that make use of it, and differentiation. 


Chapter 20 covers the essentials of numeric linear algebra. The chapter breaks into two 
parts: solving linear systems of equations by methods of Gauss, Doolittle, Cholesky, etc. 
and solving eigenvalue problems numerically. Chapter 21 again has two themes: solving 
ordinary differential equations and systems of ordinary differential equations as well as 
solving partial differential equations. 


Numerics is a very active area of research as new methods are invented, existing methods 
improved and adapted, and old methods—impractical in precomputer times—are 
rediscovered. A main goal in these activities is the development of well-structured 
software. And in large-scale work—millions of equations or steps of iterations—even 
small algorithmic improvements may have a large significant effect on computing time, 
storage demand, accuracy, and stability. 


Remark on Software Use. Part E is designed in such a way as to allow compelete flexibility 
on the use of CASs, software, or graphing calculators. The computational requirements 
range from very little use to heavy use. The choice of computer use is at the discretion 
of the professor. The material and problem sets (except where clearly indicated such as 
in CAS Projects, CAS Problems, or CAS Experiments, which can be omitted without loss 
of continuity) do not require the use of a CAS or software. A scientific calculator perhaps 
with graphing capabilities is all that is required. 


Software 


See also http://www.wiley.com/college/kreyszig/ 


The following list will help you if you wish to find software. You may also obtain information 
on known and new software from websites such as Dr. Dobb’s Portal, from articles published 
by the American Mathematical Society (see also its website at www.ams.org), the Society 
for Industrial and Applied Mathematics (SIAM, at www.siam.org), the Association for 
Computing Machinery (ACM, at www.acm.org), or the Institute of Electrical and Electronics 
Engineers (IEEE, at www.ieee.org). Consult also your library, computer science department, 
or mathematics department. 
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TI-Nspire. Includes TI-Nspire CAS and programmable graphic calculators. Texas Instru- 
ments, Inc., Dallas, TX. Telephone: 1-800-842-2737 or (972) 917-8324; website at 
www.education.ti.com. 


EISPACK. See LAPACK. 


GAMS (Guide to Available Mathematical Software). Website at http://gams.nist.gov. 
Online cross-index of software development by NIST. 


IMSL (International Mathematical and Statistical Library). Visual Numerics, Inc., 
Houston, TX. Telephone: 1-800-222-4675 or (713) 784-3131; website at www.vni.com. 
Mathematical and statistical FORTRAN routines with graphics. 


LAPACK. FORTRAN 77 routines for linear algebra. This software package supersedes 
LINPACK and EISPACK. You can download the routines from www.netlib.org/lapack. 
The LAPACK User’s Guide is available at www.netlib.org. 


LINPACK see LAPACK 


Maple. Waterloo Maple, Inc., Waterloo, ON, Canada. Telephone: 1-800-267-6583 or 
(519) 747-2373; website at www.maplesoft.com. 


Maple Computer Guide. For Advanced Engineering Mathematics, 10th edition. By 
E. Kreyszig and E. J. Norminton. John Wiley and Sons, Inc., Hoboken, NJ. Telephone: 
1-800-225-5945 or (201) 748-6000. 


Mathcad. Parametric Technology Corp. (PTC), Needham, MA. Website at www.ptc.com. 


Mathematica. Wolfram Research, Inc., Champaign, IL. Telephone: 1-800-965-3726 or 
(217) 398-0700; website at www.wolfram.com. 


Mathematica Computer Guide. For Advanced Engineering Mathematics, 10th edition. 
By E. Kreyszig and E. J. Norminton. John Wiley and Sons, Inc., Hoboken, NJ. Telephone: 
1-800-225-5945 or (201) 748-6000. 


Matlab. The MathWorks, Inc., Natick, MA. Telephone: (508) 647-7000; website at 
www.mathworks.com. 


NAG. Numerical Algorithms Group, Inc., Lisle, IL. Telephone: (630) 971-2337; website 
at www.nag.com. Numeric routines in FORTRAN 77, FORTRAN 90, and C. 


NETLIB. Extensive library of public-domain software. See at www.netlib.org. 


NIST. National Institute of Standards and Technology, Gaithersburg, MD. Telephone: 
(301) 975-6478; website at www.nist.gov. For Mathematical and Computational Science 
Division telephone: (301) 975-3800. See also http://math.nist.gov. 


Numerical Recipes. Cambridge University Press, New York, NY. Telephone: 1-800-221- 
4512 or (212) 924-3900; website at www.cambridge.org/us. Book, 3rd ed. (in C++) see 
App. 1, Ref. [E25]; source code on CD ROM in C+ +, which also contains old source code 
(but not text) for (out of print) 2nd ed. C, FORTRAN 77, FORTRAN 90 as well as source 
code for (out of print) Ist ed. To order, call office at West Nyack, NY, at 1-800-872-7423 
or (845) 353-7500 or online at www.nr.com. 


FURTHER SOFTWARE IN STATISTICS. See Part G. 
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Numerics in General 


Numeric analysis or briefly numerics has a distinct flavor that is different from basic 
calculus, from solving ODEs algebraically, or from other (nonnumeric) areas. Whereas in 
calculus and in ODEs there were very few choices on how to solve the problem and your 
answer was an algebraic answer, in numerics you have many more choices and your 
answers are given as tables of values (numbers) or graphs. You have to make judicous 
choices as to what numeric method or algorithm you want to use, how accurate you need 
your result to be, with what value (starting value) do you want to begin your computation, 
and others. This chapter is designed to provide a good transition from the algebraic type 
of mathematics to the numeric type of mathematics. 

We begin with the general concepts such as floating point, roundoff errors, and general 
numeric errors and their propagation. This is followed in Sec. 19.2 by the important topic 
of solving equations of the type f(x) = 0 by various numeric methods, including the famous 
Newton method. Section 19.3 introduces interpolation methods. These are methods that 
construct new (unknown) function values from known function values. The knowledge 
gained in Sec. 19.3 is applied to spline interpolation (Sec. 19.4) and is useful for under- 
standing numeric integration and differentiation covered in the last section. 

Numerics provides an invaluable extension to the knowledge base of the problem- 
solving engineer. Many problems have no solution formula (think of a complicated integral 
or a polynomial of high degree or the interpolation of values obtained by measurements). 
In other cases a complicated solution formula may exist but may be practically useless. 
It is for these kinds of problems that a numerical method may generate a good answer. 
Thus, it is very important that the applied mathematician, engineer, physicist, or scientist 
becomes familiar with the essentials of numerics and its ideas, such as estimation of errors, 
order of convergence, numerical methods expressed in algorithms, and is also informed 
about the important numeric methods. 


Prerequisite: Elementary calculus. 
References and Answers to Problems: App. 1 Part E, App. 2. 


19.1 Introduction 
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As an engineer or physicist you may deal with problems in elasticity and need to solve 
an equation such as x cosh x = | or a more difficult problem of finding the roots of a 
higher order polynomial. Or you encounter an integral such as 


1 
| exp (—x”) dx 
0 
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[see App. 3, formula (35)] that you cannot solve by elementary calculus. Such problems, 
which are difficult or impossible to solve algebraically, arise frequently in applications. 
They call for numeric methods, that is, systematic methods that are suitable for solving, 
numerically, the problems on computers or calculators. Such solutions result in tables of 
numbers, graphical representation (figures), or both. Typical numeric methods are iterative 
in nature and, for a well-choosen problem and a good starting value, will frequently 
converge to a desired answer. The evolution from a given problem that you observed in 
an experimental lab or in an industrial setting (in engineering, physics, biology, chemistry, 
economics, etc.) to an approximation suitable for numerics to a final answer usually 
requires the following steps. 


1. Modeling. We set up a mathematical model of our problem, such as an integral, a 
system of equations, or a differential equation. 


2. Choosing a numeric method and parameters (e.g., step size), perhaps with a 
preliminary error estimation. 


3. Programming. We use the algorithm to write a corresponding program in a CAS, 
such as Maple, Mathematica, Matlab, or Mathcad, or, say, in Java, C or Ct, or 
FORTRAN, selecting suitable routines from a software system as needed. 


4. Doing the computation. 


5. Interpreting the results in physical or other terms, also deciding to rerun if further 
results are needed. 


Steps | and 2 are related. A slight change of the model may often admit of a more efficient 
method. To choose methods, we must first get to know them. Chapters 19-21 contain efficient 
algorithms for the most important classes of problems occurring frequently in practice. 

In Step 3 the program consists of the given data and a sequence of instructions to be 
executed by the computer in a certain order for producing the answer in numeric or graphic 
form. 

To create a good understanding of the nature of numeric work, we continue in this 
section with some simple general remarks. 


Floating-Point Form of Numbers 


We know that in decimal notation, every real number is represented by a finite or an 
infinite sequence of decimal digits. Now most computers have two ways of representing 
numbers, called fixed point and floating point. In a fixed-point system all numbers are 
given with a fixed number of decimals after the decimal point; for example, numbers 
given with 3 decimals are 62.358, 0.014, 1.000. In a text we would write, say, 3 decimals 
as 3D. Fixed-point representations are impractical in most scientific computations because 
of their limited range (explain!) and will not concern us. 
In a floating-point system we write, for instance, 


0.6247 + 10°, 0.1735 - 10773, —0.2000 - 107+ 
or sometimes also 
6.247 + 107, 1735= 107™*: —2.000 + 1072. 


We see that in this system the number of significant digits is kept fixed, whereas the decimal 
point is “floating.” Here, a significant digit of a number c is any given digit of c, except 
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possibly for zeros to the left of the first nonzero digit; these zeros serve only to fix the 
position of the decimal point. (Thus any other zero is a significant digit of c.) For instance, 


13600, 1.3600, 0.0013600 


all have 5 significant digits. In a text we indicate, say, 5 significant digits, by 5S. 
The use of exponents permits us to represent very large and very small numbers. Indeed, 
theoretically any nonzero number a can be written as 


(1) a= +m- 10", 0.1 S |m| <1, n integer. 


On modern computers, which use binary (base 2) numbers, m is limited to k binary digits (e.g., 
k = 8) and nis limited (see below), giving representations (for finitely many numbers only!) 


(2) a=+m-2", m = 0.dyd2°-: dy, dy > 0. 


These numbers a are called k-digit binary machine numbers. Their fractional part m 
(or m) is called the mantissa. This is not identical with “mantissa” as used for logarithms. 
n is called the exponent of a. 

It is important to realize that there are only finitely many machine numbers and that 
they become less and less “dense” with increasing a. For instance, there are as many 
numbers between 2 and 4 as there are between 1024 and 2048. Why? 

The smallest positive machine number eps with | + eps > | is called the machine 
accuracy. It is important to realize that there are no numbers in the intervals [1, | + eps], 
[2,2 + 2+ eps],---, [1024, 1024 + 1024 - eps], ---. This means that, if the mathematical 
answer to a computation would be 1024 + 1024 - eps/2, the computer result will be either 
1024 or 1024 - eps so it is impossible to achieve greater accuracy. 


Underflow and Overflow. The range of exponents that a typical computer can handle 
is very large. The IEEE (Institute of Electrical and Electronic Engineers) floating-point 
standard for single precision is from 2~ 17° to 2178 (1.175 x 107°8 to 3.403 x 10°8) and 
for double precision it is from 271°?” to 21°74 (2.225 x 10738 to 1.798 x 10%), 

As a minor technicality, to avoid storing a minus in the exponent, the ranges are shifted 
from [—126, 128] by adding 126 (for double precision 1022). Note that shifted exponents 
of 255 and 1047 are used for some special cases such as representing infinity. 

If, in a computation a number outside that range occurs, this is called underflow when 
the number is smaller and overflow when it is larger. In the case of underflow, the result 
is usually set to zero and computation continues. Overflow might cause the computer to 
halt. Standard codes (by IMSL, NAG, etc.) are written to avoid overflow. Error messages 
on overflow may then indicate programming errors (incorrect input data, etc.). From here 
on, we will be discussing the decimal results that we obtain from our computations. 


Roundoff 


An error is caused by chopping (= discarding all digits from some decimal on) or rounding. 
This error is called roundoff error, regardless of whether we chop or round. The rule for 
rounding off a number to k decimals is as follows. (The rule for rounding off to k significant 
digits is the same, with “decimal” replaced by “significant digit.”’) 

Roundoff Rule. To round a number x to k decimals, and 5 - io 
digits after the (k + 1)st digit. 


to x and chop the 
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EXAMPLE 1 


EXAMPLE 2 


Roundoff Rule 
Round the number 1.23454621 to (a) 2 decimals, (b) 3 decimals, (c) 4 decimals, (d) 5 decimals, and (e) 6 decimals. 


Solution. (a) For 2 decimals we add 5- 107**P = 5. 1073 = 0.005 to the given number, that is, 
1.2345621 + 0.005 = 1.23 954621. Then we chop off the digits “954621” after the space or equivalently 
1.23954621 — 0.00954621 = 1.23. 

(b) 1.23454621 + 0.0005 = 1.235 04621, so that for 3 decimals we get 1.234. 

(c) 1.23459621 after chopping give us 1.2345 (4 decimals). 

(d) 1.23455121 yields 1.23455 (5 decimals). 

(e) 1.23454671 yields 1.234546 (6 decimals). 

Can you round the number to 7 decimals? | 


Chopping is not recommended because the corresponding error can be larger than that 
in rounding. (Nevertheless, some computers use it because it is simpler and faster. On the 
other hand, some computers and calculators improve accuracy of results by doing 
intermediate calculations using one or more extra digits, called guarding digits.) 


Error in Rounding. Let a = fl (a) in (2) be the floating-point computer approximation of 
a in (1) obtained by rounding, where fl suggests floating. Then the roundoff rule gives (by 
dropping exponents) |m — m| = 5 - 107”. Since |m| = 0.1, this implies (when a # 0) 


(3) 


1 1-k 
=—-10 : 
2 


The right side u = 3 + 10'~* is called the rounding unit. If we write @ = a(1 + 6), we 
have by algebra (@ — a)/a = 6, hence |6| S u by (3). This shows that the rounding unit 
u is an error bound in rounding. 

Rounding errors may ruin a computation completely, even a small computation. In 
general, these errors become the more dangerous the more arithmetic operations (perhaps 
several millions!) we have to perform. It is therefore important to analyze computational 
programs for expected rounding errors and to find an arrangement of the computations 
such that the effect of rounding errors is as small as possible. 

As mentioned, the arithmetic in a computer is not exact and causes further errors; 
however, these will not be relevant to our discussion. 


Accuracy in Tables. Although available software has rendered various tables of function 
values superfluous, some tables (of higher functions, of coefficients of integration 
formulas, etc.) will still remain in occasional use. If a table shows k significant digits, it 
is conventionally assumed that any value a in the table deviates from the exact value a 
by at most +h unit of the kth digit. 


Loss of Significant Digits 


This means that a result of a calculation has fewer correct digits than the numbers from 
which it was obtained. This happens if we subtract two numbers of about the same size, 
for example, 0.1439 — 0.1426 (“subtractive cancellation’). It may occur in simple 
problems, but it can be avoided in most cases by simple changes of the algorithm—if one 
is aware of it! Let us illustrate this with the following basic problem. 


Quadratic Equation. Loss of Significant Digits 
Find the roots of the equation 
x? + 40x + 2 =0, 


using 4 significant digits (abbreviated 4S) in the computation. 
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Solution. A formula for the roots x1, x2 of a quadratic equation ax? + bx +c = Ois 
1 1 

(4) m= > (-—b + Vb? = 4ac), ao (—b — Vb? = 4ac). 
a a 


Furthermore, since x4x2 = c/a, another formula for those roots 


Cc 


(5) xy = Xg as in (4). 


Gy’ 
We see that this avoids cancellation in x1 for positive b. 

If b < 0, calculate x, from (4) and then x2 = c/(ax}). 

For x? + 40x + 2 = Owe obtain from (4) x 20 + V398 = —20 + 19.95, hence xy = —20.00 — 19.95, 
involving no difficulty, and x, 20.00 + 19.95 0.05, a poor value involving loss of digits by subtractive 
cancellation. 

In contrast, (5) gives xy = 2.000/(—39.95) 0.05006, the absolute value of the error being less than one 
unit of the last digit, as a computation with more digits shows. The 10S-value is —0.05006265674. | 


Errors of Numeric Results 


Final results of computations of unknown quantities generally are approximations; that 
is, they are not exact but involve errors. Such an error may result from a combination 
of the following effects. Roundoff errors result from rounding, as discussed above. 
Experimental errors are errors of given data (probably arising from measurements). 
Truncating errors result from truncating (prematurely breaking off), for instance, if we 
replace a Taylor series with the sum of its first few terms. These errors depend on the 
computational method used and must be dealt with individually for each method. 
[“Truncating” is sometimes used as a term for chopping off (see before), a terminology 
that is not recommended. ] 


Formulas for Errors. If @ is an approximate value of a quantity whose exact value is 
a, we call the difference 


(6) €=a-Ga 


the error of ad. Hence 
(6*) a=ate, True value = Approximation + Error. 


For instance, if @ = 10.5 is an approximation of a = 10.2, its error is e = —0.3. The 
error of an approximation @ = 1.60 of a = 1.82 is € = 0.22. 


CAUTION! In the literature |a — G@| (“absolute error”) or @ — a are sometimes also 
used as definitions of error. 
The relative error €, of a is defined by 
a-a@a Error 


(7) =i = = (a # 0). 
a a True value 


This looks useless because a is unknown. But if |e| is much less than |@|, then we can 
use a instead of a and get 


(7') é,= =. 
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This still looks problematic because € is unknown—if it were known, we could get 
a = + e from (6) and we would be done. But what one often can obtain in practice is 
an error bound for a, that is, a number B such that 


le| = 2; hence la-a| SB. 


This tells us how far away from our computed a the unknown a can at most lie. Similarly, 
for the relative error, an error bound is a number 8, such that 


a-a@a 
a 


le,| = B,, hence 


= By. 


Error Propagation 


This is an important matter. It refers to how errors at the beginning and in later steps 
(roundoff, for example) propagate into the computation and affect accuracy, sometimes 
very drastically. We state here what happens to error bounds. Namely, bounds for the 
error add under addition and subtraction, whereas bounds for the relative error add under 
multiplication and division. You do well to keep this in mind. 


Error Propagation 


(a) In addition and subtraction, a bound for the error of the results is given by 
the sum of the error bounds for the terms. 


(b) In multiplication and division, an error bound for the relative error of the 
results is given (approximately) by the sum of the bounds for the relative errors 
of the given numbers. 


Eyl = B,. Then for the 


(a) We use the notations x = X + €;,y = y + €y, é,|'°= Bus 


error € of the difference we obtain 
lel =|x-y-@-V)| 
=|x-¥-(-F)I 
= le, _ Eyl Ss le + ley| = By, + By. 


The proof for the sum is similar and is left to the student. 


(b) For the relative error €, of * we get from the relative errors €,., and €,y of X, ¥ 
and bounds B;., Bry 


Pane xy ~ XV] [xy — &— €xg)Y — €y)| _ | Cx y + EyX — Exky 
xy xy xy 
Exy + EyX Ex Ey 
wf = Jal t fy] = level + beryl S Bre + Bry: 


This proof shows what “approximately” means: we neglected €,€, as small in absolute 
value compared to |e,,| and le,,|. The proof for the quotient is similar but slightly more 
tricky (see Prob. 13). =) 
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Basic Error Principle 


Every numeric method should be accompanied by an error estimate. If such a formula is 
lacking, is extremely complicated, or is impractical because it involves information (for 
instance, on derivatives) that is not available, the following may help. 


Error Estimation by Comparison. Do a calculation twice with different accuracy. 
Regard the difference Gz, — Gy of the results ay, dg as a (perhaps crude) estimate of the 
error €, of the inferior result a,. Indeed, ad, + €1 = Gg + €g by formula (4*). This implies 
dz — Gy, = €1 — €2 ~ €, because Gg is generally more accurate than Gj, so that leés| is 
small compared to lez|. 


Algorithm. Stability 


Numeric methods can be formulated as algorithms. An algorithm is a step-by-step 
procedure that states a numeric method in a form (a “pseudocode”) understandable to 
humans. (See Table 19.1 to see what an algorithm looks like.) The algorithm is then used 
to write a program in a programming language that the computer can understand so that 
it can execute the numeric method. Important algorithms follow in the next sections. For 
routine tasks your CAS or some other software system may contain programs that you 
can use or include as parts of larger programs of your own. 


Stability. To be useful, an algorithm should be stable; that is, small changes in the initial 
data should cause only small changes in the final results. However, if small changes in the 
initial data can produce large changes in the final results, we call the algorithm unstable. 

This “numeric instability,’ which in most cases can be avoided by choosing a better 
algorithm, must be distinguished from “mathematical instability” of a problem, which is 
called “ill-conditioning,” a concept we discuss in the next section. 

Some algorithms are stable only for certain initial data, so that one must be careful in 
such a case. 


PROBLEEM—SET 19-1 


1. 


Floating point. Write 84.175, —528.685, 0.000924138, 
and —362005 in floating-point form, rounded to 5S 
(5 significant digits). 


. Write —76.437125, 60100, and —0.00001 in floating- 


point form, rounded to 4S. 


. Small differences of large numbers may be parti- 


cularly strongly affected by rounding errors. Illustrate 
this by computing 0.81534/(35 » 724 — 35.596) as 
given with 5S, then rounding stepwise to 4S, 3S, and 2S, 
where “stepwise” means round the rounded numbers, not 
the given ones. 


. Order of terms, in adding with a fixed number of 


digits, will generally affect the sum. Give an example. 
Find empirically a rule for the best order. 


5. Rounding and adding. Let a, - -- , a, be numbers with 


a; correctly rounded to S$; digits. In calculating the sum 
a, +++: + dy, retaining S = min S; significant digits, 
is it essential that we first add and then round the result 
or that we first round each number to S significant digits 
and then add? 


. Nested form. Evaluate 


f@) =x? — 7.5x2 + 11.2x + 2.8 
= ((x — 7.5)x + 11.2)x + 2.8 


at x = 3.94 using 3S arithmetic and rounding, in both 
of the given forms. The latter, called the nested form, 
is usually preferable since it minimizes the number of 
operations and thus the effect of rounding. 
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Quadratic equation. Solve x2 — 30x + 1 = 0 by (4) 
and by (5), using 6S in the computation. Compare and 
comment. 


8. Solve x2 — 40x + 2 = 0, using 4S-computation. 


11. 
12. 


13. 
14. 


15. 


16. 


17. 


18. 


19. 


20. 


21. 


. Do the computations in Prob. 7 with 4S and 2S. 
. Instability. For small |a| the equation (x — kK? =a 


has nearly a double root. Why do these roots show 
instability? 

Theorems on errors. Prove Theorem | (a) for addition. 
Overflow and underflow can sometimes be avoided 
by simple changes in a formula. Explain this in terms 
of Vx? + y? =xV1+ (y/x)* with x? = y? and x so 
large that x” would cause overflow. Invent examples 
of your own. 


Division. Prove Theorem 1(b) for division. 


Loss of digits. Square root. Compute Vx? + 4 — 2 
with 6S arithmetic for x = 0.001 (a) as given and 
(b) from x2/(Vx? + 4 + 2) (derive!). 


Logarithm. Compute Ina — In b with 6S arithmetic 
for a = 4.00000 and b = 3.99900 (a) as given and 
(b) from In(a/b). 

Cosine. Compute 1 — cosx with 6S arithmetic for 
x = 0.02 (a) as given and (b) by 2 sin” x (derive!). 
Discuss the numeric use of (12) in App. A3.1 for 
cos v — cos u when u ~ v. 


Quotient near 0/0. (a) Compute (1 — cos x)/sin x with 
6S arithmetic for x = 0.005. (b) Looking at Prob. 16, 
find a much better formula. 


Exponential function. Calculate 1/e = 0.367879 (6S) 
from the partial sums of 5—10 terms of the Maclaurin 
series (a) of e~” with x = 1, (b) of e” with x = 1 and 
then taking the reciprocal. Which is more accurate? 

1 


Compute e~ ° with 6S arithmetic in two ways (as in 
Prob. 19). 
Binary conversion. Show that 
23 = 20-10'+ 3-10°= 16+4+2+1 
=244+2774214+29=(1 01 1 Lode 

can be obtained by the division algorithm 

2|23 Remainder 1 = co 

211 l=c 

25 1=ce 

212. 0 = ¢3 

0 l=c4 


22. 


23. 
24. 


25. 


26. 


27. 


28. 


29. 


30. 
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Convert (0.59375)19 to (0.10011)2 by successive 
multiplication by 2 and dropping (removing) the integer 
parts, which give the binary digits c,, co,--- : 


0 59375 +2 
c, = [1] 1875 - 2 
co = [0] 375-2 
cz = [0] .75 +2 
ca = [1] .5-2 
c5 = 0.0 


Show that 0.1 is not a binary machine number. 


Prove that any binary machine number has a finite 
decimal representation. Is the converse true? 


CAS EXPERIMENT. Approximations. Obtain 
x=01 = ; > 2-4” from Prob. 23. Which machine 
m=1 


number (partial sum) S,, will first have the value 0.1 
to 30 decimal digits? 


CAS EXPERIMENT. Integration from Calculus. 
Integrating by parts, show that J, = Sy ex" dx = 
e — nlyn_-1, 19 = e — 1. (a) Compute J,,n = 0,-°-, 
using 4S arithmetic, obtaining Jg = —3.906. Why is 
this nonsense? Why is the error so large? 

(b) Experiment in (a) with the number of digits k > 4. 
As you increase k, will the first negative value n = N 
occur earlier or later? Find an empirical formula for 
N = N(k). 


Backward Recursion. In Prob. 26. Using e” < e 
(0 < x < 1), conclude that |/,| S e/(n + 1) 0 as 
n—, Solve the iteration formula for J,-1 = 
(e — I,)/n, start from [15 ~ 0 and compute 4S values 
of Tha, 113, ee Ih. 


Harmonic series. | + 3 + 4 +--+ diverges. Is the 
same true for the corresponding series of computer 
numbers? 


Approximations of 7 = 3.14159265358979 --- are 
22/7 and 355/113. Determine the corresponding errors 
and relative errors to 3 significant digits. 


Compute 77 by Machin’s approximation 16 arctan 
(5) — 4 arctan (545) to 10S (which are correct). [In 
1986, D. H. Bailey (NASA Ames Research Center, 
Moffett Field, CA 94035) computed almost 30 million 
decimals of 77 on a CRAY-2 in less than 30 hrs. The 
race for more and more decimals is continuing. See the 
Internet under pi.] 
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19.2 Solution of Equations by Iteration 


For each of the remaining sections of this chapter, we select basic kinds of problems and 

discuss numeric methods on how to solve them. The reader will learn about a variety of 

important problems and become familiar with ways of thinking in numerical analysis. 
Perhaps the easiest conceptual problem is to find solutions of a single equation 


(1) f(x) = 0, 


where f is a given function. A solution of (1) is a number x = s such that f(s) = 0. Here, 
s suggests “solution,” but we shall also use other letters. 

It is interesting to note that the task of solving (1) is a question made for numeric 
algorithms, as in general there are no direct formulas, except in a few simple cases. 

Examples of single equations are x? + x = 1, sin x = 0.5x, tan.x = x, cosh x = sec x, 
cosh x cos x = —1, which can all be written in the form of (1). The first of the five equations 
is an algebraic equation because the corresponding f is a polynomial. In this case the 
solutions are called roots of the equation and the solution process is called finding roots. The 
other equations are transcendental equations because they involve transcendental functions. 

There are a very large number of applications in engineering, where we have to solve a 
single equation (1). You have seen such applications when solving characteristic equations 
in Chaps. 2, 4, and 8; partial fractions in Chap. 6; residue integration in Chap. 16, finding 
eigenvalues in Chap. 12, and finding zeros of Bessel functions, also in Chap. 12. Moreover, 
methods of finding roots are very important in areas outside of classical engineering. For 
example, in finance, the problem of determining how much a bond is worth amounts to 
solving an algebraic equation. 

To solve (1) when there is no formula for the exact solution available, we can use an 
approximation method, such as an iteration method. This is a method in which we start from 
an initial guess x9 (which may be poor) and compute step by step (in general better and better) 
approximations x1, X2,°+- of an unknown solution of (1). We discuss three such methods that 
are of particular practical importance and mention two others in the problem set. 

It is very important that the reader understand these methods and their underlying ideas. 
The reader will then be able to select judiciously the appropriate software from among 
different software packages that employ variations of such methods and not just treat the 
software programs as “black boxes.” 

In general, iteration methods are easy to program because the computational operations 
are the same in each step—just the data change from step to step—and, more importantly, 
if in a concrete case a method converges, it is stable in general (see Sec. 19.1). 


Fixed-Point Iteration for Solving Equations f(x) = 0 


Note: Our present use of the word “fixed point” has absolutely nothing to do with that in 
the last section. 
By some algebraic steps we transform (1) into the form 


(2) x = g(x). 
Then we choose an x9 and compute x; = g(%9), X2 = g(xj), and in general 


(3) Xnt1 = 8%n) (n = 0, 1,---). 
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A solution of (2) is called a fixed point of g, motivating the name of the method. This is a 
solution of (1), since from x = g(x) we can return to the original form f(x) = 0. From (1) 
we may get several different forms of (2). The behavior of corresponding iterative sequences 
Xo, X4,°*+ may differ, in particular, with respect to their speed of convergence. Indeed, some 
of them may not converge at all. Let us illustrate these facts with a simple example. 

An Iteration Process (Fixed-Point Iteration) 


Set up an iteration process for the equation f(x) = x? — 3x + 1 = 0. Since we know the solutions 
#=15 2=V 1.25, thus 2.618034 and 0.381966, 


we can watch the behavior of the error as the iteration proceeds. 


Solution. The equation may be written 
(4a) X= g1(x) = 4 (x? +1), thus Xn+1 = 4 (x2 eh). 


If we choose x9 = 1, we obtain the sequence (Fig. 426a; computed with 6S and then rounded) 


xo = 1.000, x, = 0.667, x2=0481, x3=0411, x4 =0.390,-:- 


which seems to approach the smaller solution. If we choose x9 = 2, the situation is similar. If we choose 
Xq = 3, we obtain the sequence (Fig. 426a, upper part) 


xo = 3.000, x1 = 3.333, 9x3 = 4.037, x3 = 5.766, x4 = 1115, + 


which diverges. 
Our equation may also be written (divide by x) 


1 1 
(4b) X= g(x) =3-—, thus Xn41 =3-—, 
x Xu 


and if we choose xg = 1, we obtain the sequence (Fig. 426b) 
xo = 1.000, xy, >= 2.000, xg > 2.500, x3 > 2.600, x4 2.615, a8% 


which seems to approach the larger solution. Similarly, if we choose x9 = 3, we obtain the sequence 
(Fig. 426b) 


Xo = 3.000, x1 = 2.667, Xg = 2.625, X3 = 2.619, X4 = 2.618,:--. 


x 
(a) (b) 
Fig. 426. Example 1, iterations (4a) and (4b) 
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Our figures show the following. In the lower part of Fig. 426a the slope of g;(x) is less than the slope of y = x, 
which is 1, thus |gj(x)| < 1, and we seem to have convergence. In the upper part, g(x) is steeper (g4(x) > 1) 
and we have divergence. In Fig. 426b the slope of go(x) is less near the intersection point (x = 2.618, fixed 
point of go, solution of f(x) = 0), and both sequences seem to converge. From all this we conclude that 
convergence seems to depend on the fact that, in a neighborhood of a solution, the curve of g(x) is less steep 
than the straight line y = x, and we shall now see that this condition |g’(x)| < 1 © slope of y = x) is sufficient 
for convergence. I] 


An iteration process defined by (3) is called convergent for an x9 if the corresponding 
sequence x9, X1,°** 1S convergent. 

A sufficient condition for convergence is given in the following theorem, which has 
various practical applications. 


Convergence of Fixed-Point Iteration 


Let x = s be a solution of x = g(x) and suppose that g has a continuous derivative 
in some interval J containing s. Then, if |g'(x)| S K <1 in J, the iteration process 
defined by (3) converges for any xg in J. The limit of the sequence {xy} is S. 


By the mean value theorem of differential calculus there is a ¢ between x and s such that 


g(x) — g(s) = g(x — 8) (x in J). 
Since g(s) = s and x1 = g(%o), X2 = g(*4),°**, we obtain from this and the condition on 
|e’(x)| in the theorem 
lxn — sl = [g@n—1) — 8(9)| = le’Ollan—1 — 8] S Klan — sl. 
Applying this inequality n times, for n,n — 1,---, 1 gives 
lxn — S| SKlxy_-1 — | S K7lx,-2 — s| S + SK"xqQ —s]. 
Since K < 1, we have K" 0; hence |x,, — s| ~Oasn—>~, 


We mention that a function g satisfying the condition in Theorem | is called a contraction 
because | g(x) — g(v)| = K|x — v|, where K < 1. Furthermore, K gives information on 
the speed of convergence. For instance, if K = 0.5, then the accuracy increases by at least 
2 digits in only 7 steps because 0.5’ < 0.01. 


An Iteration Process. Illustration of Theorem 1 
Find a solution of f(x) = x° + x — 1 = 0 by iteration. 


Solution. A sketch shows that a solution lies near x = 1. (a) We may write the equation as 2 1)x = lor 
ry q 


1 ; 2\-x| 
x = g(x) = ——_.,, so that Xn41 = —. Also |g4(x)| = ———— 
1+ x? 1+ x2 (1 + x2)? 
for any x because 4x?/(1 + x24 4x7/(1 + 4x? +--+.) <1, so that by Theorem 1| we have convergence for 


any Xo. Choosing x9 = 1, we obtain (Fig. 427) 
x1 = 0.500, x2 >= 0.800, x3 > 0.610, x4 = 0.729, x5 = 0.653, xg = 0:701,-"%. 


The solution exact to 6D is s = 0.682328. 
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(b) The given equation may also be written 
xX = go(x) = 1 - x? Then |go(x)| = 3x2 


and this is greater than | near the solution, so that we cannot apply Theorem | and assert convergence. Try 
Xo = 1, x9 = 0.5, x9 = 2 and see what happens. 

The example shows that the transformation of a given f(x) = 0 into the form x = g(x) with g satisfying 
|g/(x) S K < 1 may need some experimentation. 


Fig. 427. Iteration in Example 2 Fig. 428. Newton’s method 


Newton’s Method for Solving Equations f(x) = 0 


Newton’s method, also known as Newton-Raphson’s method,’ is another iteration 
method for solving equations f(x) = 0, where fis assumed to have a continuous derivative f’ . 
The method is commonly used because of its simplicity and great speed. 

The underlying idea is that we approximate the graph of f by suitable tangents. Using 
an approximate value x obtained from the graph of f, we let x1 be the point of intersection 
of the x-axis and the tangent to the curve of f at xg (see Fig. 428). Then 


tan B = f'(xo) = Te ‘ hence X1 = Xo 10) 
0 


= x4 fC) 


In the second step we compute x2 = x1 — f(x,)/f (x1), in the third step x3 from x again 
by the same formula, and so on. We thus have the algorithm shown in Table 19.1. Formula 
(5) in this algorithm can also be obtained if we algebraically solve Taylor’s formula 


(5*) Pasa) ~fGa) + Gud af Oy) =O 


1JOSEPH RAPHSON (1648-1715), English mathematician who published a method similar to Newton’s 
method. For historical details, see Ref. [GenRef2], p. 203, listed in App. 1. 
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Table 19.1 Newton’s Method for Solving Equations f(x) = 0 


ALGORITHM NEWTON ( f, f Xo, €, N) 
This algorithm computes a solution of f(x) = 0 given an initial approximation Xo (starting 
value of the iteration). Here the function f(x) is continuous and has a continuous 


derivative f x). 


INPUT: f, f ie initial approximation x9, tolerance € > 0, maximum number of 
iterations N. 


OUTPUT: Approximate solution x, (n = N) or message of failure. 


For n = 0, 1, 2,°:-,N— 1 do: 
1 Compute f (x,,). 
2 If f (x,) = 0 then OUTPUT “Failure.” Stop. 


[Procedure completed unsuccessfully] 


3 Else compute 
f(&n) 
5 a8 OX ae oe ren 
( ) n+1 n f'n) 
4 If |xn+1 — Xn| S €lxn41| then OUTPUT x,,,1. Stop. 


[Procedure completed successfully] 


End 


2) OUTPUT “Failure”. Stop. 


[Procedure completed unsuccessfully after N iterations] 


End NEWTON 


If it happens that f’(x,) = 0 for some n (see line 2 of the algorithm), then try another 
starting value x9. Line 3 is the heart of Newton’s method. 

The inequality in line 4 is a termination criterion. If the sequence of the x,, converges 
and the criterion holds, we have reached the desired accuracy and stop. Note that this is just 
a form of the relative error test. It ensures that the result has the desired number of significant 
digits. If |x,,,1| = 0, the condition is satisfied if and only if x,,,1 = x, = 0, otherwise 
|xn+1 — X»| must be sufficiently small. The factor |x,,41| is needed in the case of zeros 
of very small (or very large) absolute value because of the high density (or of the scarcity) 
of machine numbers for those x. 


WARNING! The criterion by itself does not imply convergence. Example. The 
harmonic series diverges, although its partial sums x, = Sj 1/k satisfy the criterion 
because lim (¥y +1 — Xn) = lim (1/(n + 1)) = 0. 
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EXAMPLE 4 


EXAMPLE 5 


Line 5 gives another termination criterion and is needed because Newton’s method may 
diverge or, due to a poor choice of x9, may not reach the desired accuracy by a reasonable 
number of iterations. Then we may try another x9. If f(x) = 0 has more than one solution, 
different choices of x9 may produce different solutions. Also, an iterative sequence may 
sometimes converge to a solution different from the expected one. 


Square Root 


Set up a Newton iteration for computing the square root x of a given positive number c and apply it to c = 2. 


Solution. We have x = Vc, hence f(x) x2 -—C¢ 0, f’(x) = 2x, and (5) takes the form 


2 
xn — C 1 _ ec 
Xx x x. 3 
n+1 n 2X n y ni? 5, 


For c = 2, choosing x9 = 1, we obtain 


x1 = 1.500000, x» = 1.416667, x3 = 1.414216, xq = 1.414214,---. 


x4 is exact to 6D. B 


Iteration for a Transcendental Equation 
Find the positive solution of 2 sin x = x. 


Solution. Setting f(x) = x — 2 sin x, we have f(x) = 1 — 2 cos x, and (5) gives 


Xn —2sinx, 2(sinx, —Xy,cosxn) Ny 


Xn+1 Xn ¢ 
1 = 20s: xy 1—2cosxy Dy; 


n Xn Nn Dy Xn+1 


2.00000 3.48318 1.83229 1.90100 
1.90100 3.12470 1.64847 1.89552 
1.89552 3.10500 1.63809 1.89550 
1.89550 3.10493 1.63806 1.89549 


WN e © 


From the graph of f we conclude that the solution is near x9 = 2. We compute: 
Xq = 1.89549 is exact to 5D since the solution to 6D is 1.895494. ie] 


Newton’s Method Applied to an Algebraic Equation 
Apply Newton’s method to the equation f(x) = x? + x - 1=0. 


Solution. From (5) we have 


xot+ixn—-1 2x3 +1 


Xn+1 Xn 


3x2 +1 3x2 + 1 
Starting from x9 = 1, we obtain 
x 1 = 0.750000, xo = 0.686047, x3 = 0.682340, x4 = 0.682328,:-- 


where x4 has the error —1 + 107°. A comparison with Example 2 shows that the present convergence is much 
more rapid. This may motivate the concept of the order of an iteration process, to be discussed next. | 
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Order of an Iteration Method. 
Speed of Convergence 


The quality of an iteration method may be characterized by the speed of convergence, as 
follows. 

Let xn+1 = g(%y) define an iteration method, and let x, approximate a solution s of 
x = g(x). Then x, = s — €y, where €, is the error of x,,. Suppose that g is differentiable 
a number of times, so that the Taylor formula gives 


Xn+1 = gin) = as) + g'(S\an — 8) +38" (S)On — SPR +0 


(6) 
= a(s) — g's)en + 38" (en +0 

The exponent of €,, in the first nonvanishing term after g(s) is called the order of the 
iteration process defined by g. The order measures the speed of convergence. 

To see this, subtract g(s) = s on both sides of (6). Then on the left you get x,+1 — s = 
—€n+1, where €,+1 is the error of x,,+1. And on the right the remaining expression equals 
approximately its first nonzero term because |e,,| is small in the case of convergence. 
Thus 

(a) €n4, = + g'(s)en in the case of first order, 
(7) 


(b) €n+41 = ge" (s)e in the case of second order, etc. 


Thus if €,, = 10~* in some step, then for second order, €,41; = c°* (ior =¢r 10 
so that the number of significant digits is about doubled in each step. 


Convergence of Newton’s Method 
In Newton’s method, g(x) = x — f(x)/f (x). By differentiation, 


fa? = fOof"® 
g(x) = 1 ee 
(8) ff) 
_ fof" 
FOP 
Since f(s) = 0, this shows that also g(s) = (0. Hence Newton’s method is at least of second 
order. If we differentiate again and set x = s, we find that 


Qe te 
(8*) g (5) Fo 


which will not be zero in general. This proves 


Second-Order Convergence of Newton’s Method 


If f(x) is three times differentiable and f' and f" are not zero at a solution s of 
f(x) = 0, then for xo sufficiently close to s, Newton’s method is of second order. 
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EXAMPLE 6 


EXAMPLE 7 


Comments. For Newton’s method, (7b) becomes, by (8*), 


FS) 2 
2f'(s) 


For the rapid convergence of the method indicated in Theorem 2 it is important that s be 
a simple zero of f(x) (thus f ‘(s) # 0) and that xo be close to s, because in Taylor’s formula 
we took only the linear term [see (5*)], assuming the quadratic term to be negligibly small. 
(With a bad x9 the method may even diverge!) 


(9) En+1 ~ 


Prior Error Estimate of the Number of Newton Iteration Steps 


Use x9 = 2 and x, = 1.901 in Example 4 for estimating how many iteration steps we need to produce the 
solution to 5D-accuracy. This is an a priori estimate or prior estimate because we can compute it after only 
one iteration, prior to further iterations. 


Solution. We have f(x) = x — 2 sin x = 0. Differentiation gives 


f(s) fxr) 2 sin x, 
= = 0.57. 
2f'(s) 2f"ary) 2. — 2 cos x4) 
Hence (9) gives 
len+1| ~ 0.572 ~ 0.57(0.57e7_1)” = 0.578et_y ~ ++: ~ O.5TMeMt1 <5 - 1076 
where M = 2" + 2% 14 --- +24+1=2"*1— 1. We show below that €9 ~ —0.11. Consequently, our 


condition becomes 
0.57”0.11"*1 = 5- 107%. 
Hence n = 2 is the smallest possible n, according to this crude estimate, in good agreement with Example 4. 


€9 ~ —0.11 is obtained from €; — €g = (€1 — 5) — (€g — 5S) X1 + x9 ~ 0.10, hence €y = €9 + 0.10 ~ 
—0.57e% or 0.5762 + €9 + 0.10 ~ 0, which gives €g ~ —0.11. a] 


Difficulties in Newton’s Method. Difficulties may arise if |f’(x)| is very small near a 
solution s of f(x) = 0. For instance, let s be a zero of f(x) of second or higher order. Then 
Newton’s method converges only linearly, as is shown by an application of |’ Hopital’s rule 
to (8). Geometrically, small | f ‘(x)| means that the tangent of f(x) near s almost coincides 
with the x-axis (so that double precision may be needed to get f(x) and f’(x) accurately 
enough). Then for values x = s far away from s we can still have small function values 


R(s) = f(s). 


In this case we call the equation f(x) = 0 ill-conditioned. R(‘s) is called the residual of 
f(x) = 0 at 5. Thus a small residual guarantees a small error of 5 only if the equation is 
not ill-conditioned. 


An Ill-Conditioned Equation 


f@ = x® + 1074x = 0 is ill-conditioned, x = 0 is a solution. f'(0) = 1074 is small. At ¥ = 0.1 the residual 
fOA) =2- 107° is small, but the error —0.1 is larger in absolute value by a factor 5000. Invent a more drastic 
example of your own. B 


Secant Method for Solving f(x) = 0 


Newton’s method is very powerful but has the disadvantage that the derivative f’ may 
sometimes be a far more difficult expression than f itself and its evaluation therefore 
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computationally expensive. This situation suggests the idea of replacing the derivative 
with the difference quotient 


= f@n) — f&n-1) 


f'@n) * 3 


Then instead of (5) we have the formula of the popular secant method 


> pn +10 *n n-l 


Fig. 429. Secant method 


Xn ~ Xn-1 


(10) Sera = sep. — Cia) = ee 
. mee" fGen) — fOn—v 

Geometrically, we intersect the x-axis at x,,;, with the secant of f(x) passing through 

P,,—1 and P,, in Fig. 429. We need two starting values x9 and x1. Evaluation of derivatives 

is now avoided. It can be shown that convergence is superlinear (that is, more rapid than 

linear, 1.62. 


ee || = const - ley, ; see [E5] in App. 1), almost quadratic like Newton’s 
method. The algorithm is similar to that of Newton’s method, as the student may show. 


CAUTION! It is not good to write (10) as 


Xn-1f (Xn) — Xnf(Xn-1) 
f@n) — f&n-1) : 


Xn+1 = 


because this may lead to loss of significant digits if x,, and x,_, are about equal. (Can 
you see this from the formula?) 
Secant Method 


Find the positive solution of f(x) = x — 2 sin x = 0 by the secant method, starting from x9 = 2,x1 = 1.9. 


Solution. Here, (10) is 


Gn — 2 SiNXn)O%n — Xn—1) Nn 
Xn+1 Xn Xn . 
Xn — Xn—-1 + 2(sin xy-1 — sin x») Dy 
Numeric values are: 
n Xn-1 Xn Ny, Dy Xn+1 — Xn 
1 2.000000 1.900000 —0.000740 —0.174005 —0.004253 
2 1.900000 1.895747 —0.000002 —0.006986 —0.000252 
3 1.895747 1.895494 0 0 


X3 = 1.895494 is exact to 6D. See Example 4. | 
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Summary of Methods. 
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The methods for computing solutions s of f(x) = O with given 


continuous (or differentiable) f(x) start with an initial approximation xo of s and generate 
a sequence x1, %9,°-:by iteration. Fixed-point methods solve f(x) = 0 written as 
x = g(x), so that s is a fixed point of g, that is, s = g(s). For g(x) = x — f/f’ this is 
Newton’s method, which, for good x9 and simple zeros, converges quadratically (and for 
multiple zeros linearly). From Newton’s method the secant method follows by replacing 
f'(®) by a difference quotient. The bisection method and the method of false position in 
Problem Set 19.2 always converge, but often slowly. 


PROBLEM —SET 19-2 


1-13 


FIXED-POINT ITERATION 


Solve by fixed-point iteration and answer related 
questions where indicated. Show details. 


1; 


2. 


3. 


Monotone sequence. Why is the sequence in Example 1 
monotone? Why not in Example 2? 

Do the iterations (b) in Example 2. Sketch a figure 
similar to Fig. 427. Explain what happens. 


f=x-—05cosx = 0, xo = 1. Sketch a figure. 


4. f = x — cosec x the zero near x = 1. 


5. Sketch f(x) = x? 


11. 


12. 


13. 


. Find the smallest positive solution of sinx = e~”. 


5.00x? + 1.01x + 1.88, showing 
roots near +1 and 5. Write x = g(x) = (5.00x? = 
1.01x + 1.88)/x?. Find a root by starting from x9 = 
5, 4, 1, —1. Explain the (perhaps unexpected) results. 


. Find a form x = g(x) of f(x) = 0 in Prob. 5 that yields 


convergence to the root near x = 1. 
x 


. Solve x* — x — 0.12 = 0 by starting from x9 = 1. 
. Find the negative solution of x*-—x-012=0. 
10. 


Elasticity. Solve xcoshx = 1. (Similar equations 
appear in vibrations of beams; see Problem Set 12.3.) 


Drumhead. Bessel functions. A partial sum of the 
Maclaurin series of Jo(x) (Sec. 5.5) is f(x) = 1 — ax + 
eax? = a50ax°. Conclude from a sketch that f(x) = 0 
nearx = 2. Write f(x) = Oasx = g(x) (by dividing f(x) 
by dx and taking the resulting x-term to the other side). 
Find the zero. (See Sec. 12.10 for the importance of these 


ZerOS.) 


CAS EXPERIMENT. Convergence. Let f(x) = x? + 
2x? — 3x — 4 = 0. Write this as x = g(x), for g choos- 
ne Gon C) GP Sif) Ol aeay 
(4) @? — f)/x?, (5) (2x? — f)/(2x), and (6) x — f/f’ 
and in each case x9 = 1.5. Find out about convergence 
and divergence and the number of steps to reach 6S- 
values of a root. 


Existence of fixed point. Prove that if g is continuous 
in a closed interval J and its range lies in /, then the 
equation x = g(x) has at least one solution in /. Illustrate 
that it may have more than one solution in J. 


14-23 


NEWTON’S METHOD 


Apply Newton’s method (6S-accuracy). First sketch the 
function(s) to see what is going on. 


14 


15 


16. 
17. 


18. 


19. 


20. 
21. 
22. 


23. 


24. 


. Cube root. Design a Newton iteration. Compute 
W7, xo = 2s 
. f = 2x — cosx, XxXg = 1. Compare with Prob. 3. 
What happens in Prob. 15 for any other x9? 
Dependence on xo. Solve Prob. 5 by Newton’s method 
with x9 = 5,4, 1, —3. Explain the result. 
Legendre polynomials. Find the largest root of 
the Legendre polynomial P3(x) given by P53(x) = 
4 (63x° — 70x? + 15x) (Sec. 5.3) (to be needed in 
Gauss integration in Sec. 19.5) (a) by Newton’s 
method, (b) from a quadratic equation. 
Associated Legendre functions. Find the smallest posi- 
tive zero of P3 =(1 x2) Py = 15 ( 7x4 + 8x2 — 1) 
(Sec. 5.3) (a) by Newton’s method, (b) exactly, by 
solving a quadratic equation. 


x+Inx =2, x9 =2 
f=x?-5x+3=0, x9 =2, 0,-2 
Heating, cooling. At what time x (4S-accuracy only) will 


the processes governed by f,(x) = 100(1 — e 9?) and 
fox) = 40e~°9!” reach the same temperature? Also 
find the latter. 

Vibrating beam. Find the solution of cos x cosh x = 1 
near x = 3m. (This determines a frequency of a 
vibrating beam; see Problem Set 12.3.) 

Method of False Position (Regula falsi). Figure 430 
shows the idea. We assume that f is continuous. We 
compute the x-intercept co of the line through 
(ao, f(ao)), (bo, f(bo)). Tf f(co) = 0, we are done. If 
(do) f(co) < 0 (as in Fig. 430), we set ay = do, by = Co 
and repeat to get c,, etc. If f(ag)f(co) > 0, then 
(co) f(bo) < 0 and we set ay = co, by = bo, ete. 

(a) Algorithm. Show that 


dof (bo) — bo f(ao) 
f(bo) — f(ao) 


and write an algorithm for the method. 


Co = 
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Fig. 430. Method of false position 


(b) Solve xt = 2, cosx = Vx, andx + Inx = 2, with 
a=1,b=2. 


TEAM PROJECT. Bisection Method. This simple but 
slowly convergent method for finding a solution of 


f(x) = 0 with continuous fis based on the intermediate 


value theorem, which states that if a continuous function 


has opposite signs at some x = a and x = b(> a), that 


is, either f(a) < 0, f(b) > 0 or f@ > 0, f(b) < 0, then f 


19.3 Interpolation 


must be 0 somewhere on [a, b]. The solution is found 
by repeated bisection of the interval and in each iteration 
picking that half which also satisfies that sign condition. 
(a) Algorithm. Write an algorithm for the method. 
(b) Comparison. Solve x = cos x by Newton’s method 
and by bisection. Compare. 


(c) Solve e&* = Inxand e* + x* + x = 2 by bisection. 
26-29| SECANT METHOD 
Solve, using x9 and x, as indicated: 
26. e°* —tanx =0, xg =1, x1 = 0.7 
27. Prob. 21, x9 = 1.0, x1 = 2.0 
28. x =cosx, x9 = 0.5, x, =1 
29. sinx =cotx, xg=1, x1 =0.5 


30. 


WRITING PROJECT. Solution of Equations. 
Compare the methods in this section and problem set, 
discussing advantages and disadvantages in terms of 
examples of your own. No proofs, just motivations and 
ideas. 


We are given the values of a function f(x) at different points xo, x1,°°+, xn. We want to 
find approximate values of the function f(x) for “new” x’s that lie between these points 
for which the function values are given. This process is called interpolation. The student 
should pay close attention to this section as interpolation forms the underlying foundation 
for both Secs. 19.4 and 19.5. Indeed, interpolation allows us to develop formulas for 
numeric integration and differentiation as shown in Sec. 19.5. 

Continuing our discussion, we write these given values of a function f in the form 


fo = fo): 


or as ordered pairs 


(Xo, fo): 


Ai =f@y), +, 


(x1, f1) Beer 


Sn = fn) 


(Xn, fn): 


Where do these given function values come from? They may come from a “mathematical” 
function, such as a logarithm or a Bessel function. More frequently, they may be measured 
or automatically recorded values of an “empirical” function, such as air resistance of a 
car or an airplane at different speeds. Other examples of functions that are “empirical” 
are the yield of a chemical process at different temperatures or the size of the U.S. 
population as it appears from censuses taken at 10-year intervals. 

A standard idea in interpolation now is to find a polynomial p,, (x) of degree n (or less) 


that assumes the given values; thus 


(1) Pnixo) = fo: 


Prlx1) = fi aoe 


Pr(Xn) = fn- 


We call this p, an interpolation polynomial and xo,---, x, the nodes. And if f(x) is a 
mathematical function, we call p,, an approximation of f (or a polynomial approximation, 
because there are other kinds of approximations, as we shall see later). We use py, to get 
(approximate) values of f for x’s between x9 and x,, (“interpolation”) or sometimes outside 
this interval x9 S x S x, (“extrapolation”). 
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Motivation. Polynomials are convenient to work with because we can readily differentiate 
and integrate them, again obtaining polynomials. Moreover, they approximate continuous 
functions with any desired accuracy. That is, for any continuous f(x) on an interval 
J:a =x =b and error bound B > 0, there is a polynomial p,,(x) (of sufficiently high 
degree n) such that 


| f(x) — pn)| < B for all x on J. 


This is the famous Weierstrass approximation theorem (for a proof see Ref. [GenRef7], 
App. 1). 


Existence and Uniqueness. Note that the interpolation polynomial p,, satisfying (1) for 
given data exists and we shall give formulas for it below. Furthermore, p,, is unique: 
Indeed, if another polynomial gq, also satisfies g,(xo) = fo.'*:,>9nn) = fn, then 
Pr(X) — dn(X) = Oat x9, +++, Xp, but a polynomial p,, — q,, of degree n (or less) with n + 1 
roots must be identically zero, as we know from algebra; thus p,,.(x) = q(x) for all x, which 
means uniqueness. a 


How Do We Find p,,? We shall explain several standard methods that give us py. By 
the uniqueness proof above, we know that, for given data, the different methods must give 
us the same polynomial. However, the polynomials may be expressed in different forms 
suitable for different purposes. 


Lagrange Interpolation 


Given (xo, fo), %1.f1).°**. Ons fn) with arbitrarily spaced x;, Lagrange had the idea of 
multiplying each f; by a polynomial that is | at x; and 0 at the other n nodes and then 
taking the sum of these n + | polynomials. Clearly, this gives the unique interpolation 
polynomial of degree n or less. Beginning with the simplest case, let us see how this 
works. 


Linear interpolation is interpolation by the straight line through (x9, fo), (1, f1); see 
Fig. 431. Thus the linear Lagrange polynomial p, is a sum py = Lofo + Lif, with Lo 
the linear polynomial that is 1 at xg and O at x1; similarly, Ly is 0 at x9 and | at x4. 


Obviously, 
ie as pe ey 
Lo) = 55x17 L1@) 55 = x0" 
This gives the linear Lagrange polynomial 
X— X1 X — XQ 
(2) Pil) = Lo@)fo + OA = 3 = yy Jot eax, 
: Error 
| 
| } y =f{(x) 
| | | 
Hy z z 


Fig. 431. Linear Interpolation 
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EXAMPLE 1 Linear Lagrange Interpolation 


Compute a 4D-value of In 9.2 from In 9.0 = 2.1972, In 9.5 = 2.2513 by linear Lagrange interpolation and 
determine the error, using In 9.2 = 2.2192 (4D). 


Solution. xo = 9.0, x, = 9.5, fo = In 9.0, fy = In 9.5. Ln (2) we need 


Lo(x) 7 2.0(x — 9.5), — Lo(9.2) = —2.0(—0.3) = 0.6 


L(x) = ras =2.0(x- 9.0), 1,(9.2)=2-02=04 


(see Fig. 432) and obtain the answer 


In 9.2 ~ px(9.2) = Lo(9.2) fo + L1(9.2) fy = 0.6 + 2.1972 + 0.4 + 2.2513 = 2.2188. 


The error is € = a — d = 2.2192 — 2.2188 = 0.0004. Hence linear interpolation is not sufficient here to get 
4D accuracy; it would suffice for 3D accuracy. | 


Fig. 432. Lo and L, in Example 1 


Quadratic interpolation is interpolation of given (xo, fo), (1, fi), (%2, fe) by a second- 
degree polynomial po(x), which by Lagrange’s idea is 


(3a) Px) = Lo@)fo + LiQfr + Lea(x)fo 


with Lo(xo) = 1, Liv) = 1, Lo(xe2) = 1, and Lo(x1) = Lo(x2) = 0, etc. We claim that 


Lo(x) = Lox) _ (x — x4)(% — x2) 

° lo(xo) (Xo — x1)(Xo — X2) 
_ 14(x) = (x — xo)(X — X2) 

a meh ae ee eee 
L(x) = Io(x) _ (x — xo)(x — x4) 


In(xg) (x2 — Xo)(X2 — X1) 
How did we get this? Well, the numerator makes L;,(x;) = 0 ifj # k. And the denominator 


makes L;,(x,) = 1 because it equals the numerator at x = x;,. 


EXAMPLE 2 Quadratic Lagrange Interpolation 
Compute In 9.2 by (3) from the data in Example | and the additional third value In 11.0 = 2.3979. 
Solution. In (3), 


io oe TN 2 bese Ode «6 Exo) S400, 
(9.0 — 9.5)(9.0 — 11.0) 
= = 1 
nije eS (x2 — 20x + 99), L4(9.2) = 0.4800, 
(9.5 — 9.09.5 — 11.0) 0.75 
— = 1 
Lo(x) ere Ae) (x2 — 18.5x + 85.5), La(9.2) = —0.0200, 


(11.0 — 9.0)(11.0— 9.5) 3 
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(see Fig. 433), so that (3a) gives, exact to 4D, 
In 9.2 = po(9.2) = 0.5400 - 2.1972 + 0.4800 + 2.2513 — 0.0200 + 2.3979 = 2.2192. al 


Fig. 433. Lo, Ly, Lz in Example 2 


General Lagrange Interpolation Polynomial. For general n we obtain 


(4a) fO=piO= > Lf, = > aa 


k=0 k=0 


ti 


where L;,(x,) = 1 and Ly is 0 at the other nodes, and the L;, are independent of the function 
f to be interpolated. We get (4a) if we take 


Ie) = (Ce = ate = cilers (Gs > ah), 
(4b) UKE) = (Cs = ae) OCs = sep NCE = senee)) 82° (C2 = sea) 0<k<na, 


In(x) = (& — xo)(% — x1) ++*@ — Xn-1)- 


We can easily see that py, (x) = fx. Indeed, inspection of (4b) shows that /;,(x;) = 0 if 
j # k, so that for x = x,, the sum in (4a) reduces to the single term (/,(x)//K%)) fk = Si: 


Error Estimate. If f is itself a polynomial of degree n (or less), it must coincide 
with p,, because the n + 1| data (xo, fo),:**, (Xn, fn) determine a polynomial uniquely, 
so the error is zero. Now the special f has its (7 + 1)st derivative identically zero. This 
makes it plausible that for a general f its (n + 1)st derivative f "+) should measure the 
error 


En(x) = SX) ~ Py(%). 
It can be shown that this is true if f ™*D exists and is continuous. Then, with a suitable 
t between xo and x,, (or between xo, x, and x if we extrapolate), 


wie iG 


(5) En(X) = FX) — Pn(X) = (& — xo) — X1) °° — Xn) a 


Thus |e,,(x)| is 0 at the nodes and small near them, because of continuity. The product 
(x — X9)*+:(* — Xp) is large for x away from the nodes. This makes extrapolation risky. 
And interpolation at an x will be best if we choose nodes on both sides of that x. Also, 
we get error bounds by taking the smallest and the largest value of f *D(t) in (5) on the 
interval x9 = tf = xy (or on the interval also containing x if we extrapolate). 
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Most importantly, since p,, is unique, as we have shown, we have 


Error of Interpolation 


Formula (5) gives the error for any polynomial interpolation method if f(x) has a 
continuous (n + 1)st derivative. 


Practical error estimate. If the derivative in (5) is difficult or impossible to obtain, apply 
the Error Principle (Sec. 19.1), that is, take another node and the Lagrange polynomial 
Pn+1(x) and regard p,,+1(x) — py(x) as a (crude) error estimate for p,,(x). 


Error Estimate (5) of Linear Interpolation. Damage by Roundoff. Error Principle 

Estimate the error in Example | first by (5) directly and then by the Error Principle (Sec. 19.1). 

Solution. (A) Estimation by (5). We have n = 1, f(t) = Int, f(t) = 1/t, f"®) = —1/t?. Hence 
(-1) 


0.03 
€1(x) = (x — 9.0)(x — 9.5) ; thus €1(9.2) = —. 
21 1? 


t = 0.9 gives the maximum 0.03/9? = 0.00037 and t = 9.5 gives the minimum 0.03/9.5” = 0.00033, so that 
we get 0.00033 S €1(9.2) S 0.00037, or better, 0.00038 because 0.3/81 = 0.003703 -:-. 

But the error 0.0004 in Example | disagrees, and we can learn something! Repetition of the computation there 
with 5D instead of 4D gives 


In 9.2 © p4(9.2) = 0.6 + 2.19722 + 0.4 + 2.25129 = 2.21885 


with an actual error € = 2.21920 — 2.21885 = 0.00035, which lies nicely near the middle between our two 
error bounds. 

This shows that the discrepancy (0.0004 vs. 0.00035) was caused by rounding, which is not taken into account 
in (5). 

(B) Estimation by the Error Principle. We calculate p,(9.2) = 2.21885 as before and then po(9.2) as in 
Example 2 but with 5D, obtaining 


P2(9.2) = 0.54 + 2.19722 + 0.48 + 2.25129 — 0.02 - 2.39790 = 2.21916. 


The difference po(9.2) — p1(9.2) = 0.00031 is the approximate error of p;(9.2) that we wanted to obtain; this 
is an approximation of the actual error 0.00035 given above. fe] 


Newton’s Divided Difference Interpolation 


For given data (x9, fo),:**. (Xn; fn) the interpolation polynomial p,,(x) satisfying (1) is 
unique, as we have shown. But for different purposes we may use p,,(x) in different forms. 
Lagrange’s form just discussed is useful for deriving formulas in numeric differentiation 
(approximation formulas for derivatives) and integration (Sec. 19.5). 

Practically more important are Newton’s forms of p,,(x), which we shall also use for solving 
ODEs (in Sec. 21.2). They involve fewer arithmetic operations than Lagrange’s form. 
Moreover, it often happens that we have to increase the degree n to reach a required accuracy. 
Then in Newton’s forms we can use all the previous work and just add another term, a 
possibility without counterpart for Lagrange’s form. This also simplifies the application of 
the Error Principle (used in Example 3 for Lagrange). The details of these ideas are as follows. 

Let py—1(x) be the (n — 1)st Newton polynomial (whose form we shall determine); 
thus py—-1(Xo0) = fo. Pn—-14 1) = fi. **s Pn—-14n—-1) = fn—1. Furthermore, let us write the 
nth Newton polynomial as 


(6) Pr) = Pn-1) + 8n(X); 
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(6) 8n(X) = Pn(X) — Pn-12). 


Here g(x) is to be determined so that py(xo) = fo, Pn) = ft.***, Pn@&n) = fn- 

Since py, and py- 1 agree at x0,°**,xXn-1, we see that g, is zero there. Also, g, will 
generally be a polynomial of nth degree because so is py, whereas py— 1 can be of degree 
n— 1 at most. Hence g,, must be of the form 


(6") SnlX) = An(X — XOX — X1)°+*°@ — Xn-1)- 


We determine the constant a,,. For this we set x = x», and solve (6”) algebraically for ay. 
Replacing g,(x,) according to (6’) and using pp(xn) = fr, we see that this gives 


tn — Pn-1.Xn) 


Gn — Xo\%n — X1) °° Bn — Xn-1) ; 


(7) ay = 


We write a, instead of a,, and show that a;, equals the kth divided difference, recursively 
denoted and defined as follows: 


Ai =f 
ay = f[xo, x1] = 
= _ f[%1, x2] — f[xo, ¥1] 
ag = f[Xo, 1, 2] <7 
and in general 
[x °° "4X ] = [x "4 =all 
(8) aie = Fle, ++ x4) = Ren 


Ifn = 1, then py—-1@n) = po(*1) = fo because po(x) is constant and equal to fo, the value 
of f(x) at x9. Hence (7) gives 


fi — Po1) _ fi — fo 


X1 ~ Xo X1 ~ Xo 


a= = flxo, x1], 

and (6) and (6”) give the Newton interpolation polynomial of the first degree 
Pix) = fo + (« — xo) f [xo x11. 

If n = 2, then this py and (7) give 


f2 — Pie) to = fo ~ 2 = Xo) flXo, Xi) _ 


(x2 — xo\(%2 — x3) (xg — Xo)(Xqg — x4) 


dz = f (xo, 1, x2] 


where the last equality follows by straightforward calculation and comparison with the 
definition of the right side. (Verify it; be patient.) From (6) and (6”) we thus obtain the 
second Newton polynomial 
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P2lx) = fo + (x — Xo) f[X0, X1] + (X — Xo) — xy) f[X0, X1, Xe]. 


For n = k, formula (6) gives 
(9) P(X) = Pe—-1%) + (% — Xo) — xy) — XK-DF[X0,°**» Xe: 


With po(x) = fo by repeated application with k = 1,---,n this finally gives Newton’s 
divided difference interpolation formula 


(10) F(X) ~ fo + & — Xo)f[Xo, x1] + & — xo) — x1) f[X0, «1, x2] 
ete gle = ha) Ce kya) Loe ee 
An algorithm is shown in Table 19.2. The first do-loop computes the divided differences 
and the second the desired value p,,(X). 
Example 4 shows how to arrange differences near the values from which they are 
obtained; the latter always stand a half-line above and a half-line below in the preceding 
column. Such an arrangement is called a (divided) difference table. 


Table 19.2 Newton’s Divided Difference Interpolation 


ALGORITHM INTERPOL (xo,°**, Xni fort**> fn D 
This algorithm computes an approximation p,,(x) of f(x) at x. 
INPUT: Data (Xo, fo), (1, fi)s*** > Qi fn)s X 
OUTPUT: Approximation p,,(x) of f(x) 

Set flxj] = f; (7 = 0,---, 7). 
For m = 1,:--,n — 1) do: 
For j = 0,:-+,2 — m do: 


F1Xj410°°*  Xj4m] — fp ++ Xj4-m-1] 


Xj+m — Xj 


flXj° ++ Xj4m] = 
End 
End 


Set po(x) = fo- 
For k = 1,-::,ndo: 


Pr(X) = Pp) + & — Xo) +++ @ — X_ DF os» Xe] 


End 
OUTPUT p,,(x) 
End INTERPOL 
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EXAMPLE 4 _ Newton’s Divided Difference Interpolation Formula 
Compute f(9.2) from the values shown in the first two columns of the following table. 
x; ds = f(x) hiker X41] [5 Xj+d> Xj+2] ks rine) Xj+3] 


80 
9.0 2.197225 


0.108134 
9.5 2.251292 —0.005200 
0.097735 
11.0 2.397895 


Solution. We compute the divided differences as shown. Sample computation: 
(0.097735 — 0.108134)/(11 — 9) = —0.005200. 


The values we need in (10) are circled. We have 


F(X) © p3(x) = 2.079442 + 0.117783(x — 8.0) — 0.006433(x — 8.0)(x — 9.0) 


+ 0.000411(x — 8.0)(x — 9.0)(x — 9.5). 


At x = 9.2, 


f(9.2) ~ 2.079442 + 0.141340 — 0.001544 — 0.000030 = 2.219208. 


The value exact to 6D is f(9.2) = In 9.2 = 2.219203. Note that we can nicely see how the accuracy increases 


from term to term: 


Pi(9.2) = 2.220782, p2(9.2) = 2.219238, p3(9.2) = 2.219208. 


Equal Spacing: Newton's Forward Difference Formula 


Newton’s formula (10) is valid for arbitrarily spaced nodes as they may occur in practice in 
experiments or observations. However, in many applications the x;’s are regularly spaced— 
for instance, in measurements taken at regular intervals of time. Then, denoting the distance 


by h, we can write 


(11) Xo Xp=Xo th, xg =XQ9 + 2h, +++, Xn =X t+ nh. 


We show how (8) and (10) now simplify considerably! 
To get started, let us define the first forward difference of f at x; by 


Af; = Siar a Sip 
the second forward difference of f at x; by 
A“fi = Mfier — Mf 


and, continuing in this way, the kth forward difference of f at x; by 


(12) A‘f = Ae = Ale 
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Examples and an explanation of the name “forward” follow on the next page. What is the 
point of this? We show that if we have regular spacing (11), then 


il 
(13) flto-~,4%] =—_,, A. 
kth 


We prove (13) by induction. It is true for k = 1 because x1 = xg + h, so that 


ae ee (fi — fo) = : Afo. 


X41 — XO h ~ Ith 


f[xo, *1)] = 


Assuming (13) to be true for all forward differences of order k, we show that (13) holds for 
k + 1. We use (8) with k + 1 instead of k; then we use (kK + lh = xx~41 — Xo, resulting 
from (11), and finally (12) with j = 0, that is, A**4f = A*f, — A*fo. This gives 


fl . ] fl%1.°++s Xe41) — f[X0.° ++ XK] 
‘ X0; »Xk4+1 (k Ae Dh 
1 1 k 1 k 
= A A 
a+ pn lam py So 
1 k+1 
ee 
(k + 1tn*t 
which is (13) with k + 1 instead of k. Formula (13) is proved. | 


In (10) we finally set x =x9 + rh. Then x — x9 = rh, x — xy =(r— Wh since 
X1 —X9 =/h, and so on. With this and (13), formula (10) becomes Newton’s (or 
Gregory’—Newton’s) forward difference interpolation formula 


SEN E 
ed) se — (“lass G19 7 F(X) 2) 
=0 
(14) : 
= Il = jooo@—msr ll 
= fy + rMfy +“ ary + pee) 2 " d Arf, 
4 nt 
where the binomial coefficients in the first line are defined by 
—-Dor-2)--r7-st+1 
(15) (") = 1, (") =i ts : vs ) (s > 0, integer) 
Ss Ay 


and s!} = 1+2-:--s, 
Error. From (5) we get, withx — x9 = rh, x — x, =(r-— DA, ete., 


Tata 


(16) En(x) = f(x) — Py) = (r—-D er — ny fr rp 


fea 


with ¢ as characterized in (5). 


2JAMES GREGORY (1638-1675), Scots mathematician, professor at St. Andrews and Edinburgh. A in (14) 
and V” (on p. 818) have nothing to do with the Laplacian. 
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EXAMPLE 5 


Formula (16) is an exact formula for the error, but it involves the unknown ¢. In 
Example 5 (below) we show how to use (16) for obtaining an error estimate and an 
interval in which the true value of f(x) must lie. 


Comments on Accuracy. (A) The order of magnitude of the error €,,(x) is about equal 
to that of the next difference not used in p,(x). 


(B) One should choose xo, ---, 2, such that the x at which one interpolates is as well 
centered between xo,°-:, xX, aS possible. 
The reason for (A) is that in (16), 


An t@ rea De Goal 2 


: if lr] <1 
prt 1-2-:-(n +1) 


7" 2g 
(and actually for any r as long as we do not extrapolate). The reason for (B) is that 
Ir(r — 1) «+ (r — n)| becomes smallest for that choice. 


Newton’s Forward Difference Formula. Error Estimation 


Compute cosh 0.56 from (14) and the four values in the following table and estimate the error. 


i X;j f; = cosh x; Af; 
0 0.5 1.127626 
0.057839 


1 0.6 1.185465 0.011865 
0.069704 0.000697 


2 0.7 1.255169 0.012562 
0.082266 


AST A, 


3 0.8 1.337435 


Solution. We compute the forward differences as shown in the table. The values we need are circled. In (14) 
we have r = (0.56 — 0.50)/0.1 = 0.6, so that (14) gives 


0.6(—0.4) 0.6(—0.4)(— 1.4) 


cosh 0.56 ~ 1.127626 + 0.6 + 0.057839 + + 0.011865 + + 0.000697 


1.127626 + 0.034703 — 0.001424 + 0.000039 


1.160944. 


Error estimate. From (16), since the fourth derivative is cosh ¢ = cosh t, 


0.14 
€3(0.56) FF + 0.6(—0.4)(— 1.4)(—2.4) cosh t 


= Acosht, 


where A = —0.00000336 and 0.5 S t S 0.8. We do not know f, but we get an inequality by taking the largest 
and smallest cosh ¢ in that interval: 


A cosh 0.8 S €3(0.62) S A cosh 0.5. 
Since 


f(x) = ps) + €3(2), 


818 


EXAMPLE 6 


CHAP. 19 Numerics in General 


this gives 
p3(0.56) + A cosh 0.8 S cosh 0.56 S p3(0.56) + A cosh 0.5. 
Numeric values are 
1.160939 S cosh 0.56 S 1.160941. 


The exact 6D-value is cosh 0.56 = 1.160941. It lies within these bounds. Such bounds are not always so tight. 
Also, we did not consider roundoff errors, which will depend on the number of operations. 


This example also explains the name “forward difference formula”: we see that the 
differences in the formula slope forward in the difference table. 


Equal Spacing: Newton’s Backward Difference Formula 


Instead of forward-sloping differences we may also employ backward-sloping differ- 
ences. The difference table remains the same as before (same numbers, in the same 
positions), except for a very harmless change of the running subscript j (which we explain 
in Example 6, below). Nevertheless, purely for reasons of convenience it is standard to 
introduce a second name and notation for differences as follows. We define the first 
backward difference of f at x; by 


Whi =i ~ fi-1 
the second backward difference of f at x; by 
Vf = Vi — Vi 


and, continuing in this way, the kth backward difference of f at x; by 
(17) Vp = VG tan (k= 1,2,+**), 


A formula similar to (14) but involving backward differences is Newton’s (or 
Gregory—Newton’s) backward difference interpolation formula 


Mfr+s—1 
fo) ~ pat = ("*S ) re (x = x9 + rh,r = (x — xo)/h) 


” +1)--(rtn-1 
eee) 


n! 


V"fo- 


Newton’s Forward and Backward Interpolations 


Compute a 7D-value of the Bessel function Jo(x) for x = 1.72 from the four values in the following table, using 
(a) Newton’s forward formula (14), (b) Newton’s backward formula (18). 
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Jee wigack Xj Jo(x;) Ist Diff. 2nd Diff. 3rd Diff. 
0 =3 1.7 0.3979849 
—0.0579985 
1 —2 1.8 0.3399864 —0.0001693 
—0.0581678 0.0004093 
2 =ill 1.9 0.2818186 0.0002400 
—0.0579278 
3 0 2.0 0.2238908 


Solution. The computation of the differences is the same in both cases. Only their notation differs. 


(a) Forward. In (14) we have r = (1.72 — 1.70)/0.1 = 0.2, and j goes from 0 to 3 (see first column). In 
each column we need the first given number, and (14) thus gives 


Jo (1.72) ~ 0.3979849 + 0.2(—0.0579985) + 


0.2(—0.8) 0.2(—0.8)(— 1.8) 


(—0.0001693) + + 0.0004093 


= 0.3979849 — 0.0115997 + 0.0000135 + 0.0000196 = 0.3864183, 


which is exact to 6D, the exact 7D-value being 0.3864185. 


(b) Backward. For (18) we use j shown in the second column, and in each column the last number. Since 
r = (1.72 — 2.00)/0.1 = —2.8, we thus get from (18) 


Jo(1.72) ~ 0.2238908 — 2.8(—0.0579278) + 


—2.8(—1.8) 
2 


- 0.0002400 + a 


+ 0.0004093 


= 0.2238908 + 0.1621978 + 0.0006048 — 0.0002750 


= 0.3864184. 


There is a third notation for differences, called the central difference notation. It 
is used in numerics for ODEs and certain interpolation formulas. See Ref. [ES] listed in 


App. 1. 


PROBLEM SET 19-3 


1. 


. Linear and quadratic interpolation. Find e~® 


Linear interpolation. Calculate p(x) in Example 1 
and from it In 9.3. 


. Error estimate. Estimate the error in Prob. | by (5). 
. Quadratic interpolation. Gamma function. Calculate 


the Lagrange polynomial po(x) for the values 
T(1.00) = 1.0000, ['(1.02) = 0.9888, 11.04) = 0.9784 
of the gamma function [(24) in App. A3.1] and from it 
approximations of (1.01) and ['(1.03). 


. Error estimate for quadratic interpolation. Estimate 


the error for po(9.2) in Example 2 from (5). 


25 and 


eee by linear interpolation of e~” with x9 = 0, 
x1 = 0.5 and x9 = 0.5, x; = 1, respectively. Then find 
P2(x) by quadratic interpolation of e~” with x9 = 0, 
x, =0.5,x2 = 1 and from it e 95 and e7 O79. 


Compare the errors. Use 4S-values of e~”. 


6. 


Interpolation and extrapolation. Calculate po(x) in 
Example 2. Compute from it approximations of 
In 9.4, In 10, In 10.5, In 11.5, and In 12. Compute the 
errors by using exact 5S-values and comment. 


. Interpolation and extrapolation. Find the quadratic 


polynomial that agrees with sin x at x = 0, 7/4, 7/2 
and use it for the interpolation and extrapolation of sin x 
at x = —77/8, 7/8, 377/8, 57/8. Compute the errors. 


. Extrapolation. Does a sketch of the product of the 


(x — xj) in (5) for the data in Example 2 indicate that 
extrapolation is likely to involve larger errors than 
interpolation does? 


. Error function (35) in App. A3.1. Calculate the 


Lagrange polynomial po(x) for the 5S-values f(0.25) = 
0.27633, f(0.5) = 0.52050, f(1.0) = 0.84270 and from 
P2(x) an approximation of f(0.75) (= 0.71116). 
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10. 
11. 


12. 


13 


14. 


15. 
16. 


17. 
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Error bound. Derive an error bound in Prob. 9 from (5). 
Cubic Lagrange interpolation. Bessel function Jo. 
Calculate and graph Lo, Ly, Lo,L3 with xo = 0, 
xy = 1,x2 = 2,x3 = 3 on common axes. Find p3(x) 
for the data (0,1), (1, 0.765198), (2, 0.223891), 
(3, —0.260052) [values of the Bessel function Jo(x)]. 
Find p3 for x = 0.5, 1.5, 2.5 and compare with the 6S- 
exact values 0.938470, 0.511828, —0.048384. 
Newton’s forward formula (14). Sine integral. Using 
(14), find (1.25) by linear, quadratic, and cubic 
interpolation of the data (values of (40) in App. A31); 6S- 
value Si(1.25) = 1.14645) f(1.0) = 0.94608, f(1.5) = 
1.32468, f(2.0) = 1.60541, f(2.5) = 1.77852, and com- 
pute the errors. For the linear interpolation use f(1.0) 
and f(1.5), for the quadratic f(1.0), (1.5), (2.0), etc. 
Lower degree. Find the degree of the interpolation 
polynomial for the data (—4, 50), (—2, 18), (0, 2), (2, 2), 
(4, 18), using a difference table. Find the polynomial. 
Newton’s forward formula (14). Gamma function. 
Set up (14) for the data in Prob. 3 and compute ['(1.01), 
11.03), 71.05). 

Divided differences. Obtain pz in Example 2 from (10). 
Divided differences. Error function. Compute po(0.75) 
from the data in Prob. 9 and Newton’s divided difference 
formula (10). 

Backward difference formula (18). Use po(x) in (18) 
and the values of erf x, x = 0.2, 0.4, 0.6 in Table A4 of 
App. 5, compute erf 0.3 and the error. (4S-exact erf 0.3 = 
0.3286). 


18. 


19. 


20. 


21. 


In Example 5 of the text, write down the difference table 
as needed for (18), then write (18) with general x and 
then with x = 0.56 to verify the answer in Example 5. 
CAS EXPERIMENT. Adding Terms in Newton 
Formulas. Write a program for the forward formula 
(14). Experiment on the increase of accuracy by 
successively adding terms. As data use values of some 
function of your choice for which your CAS gives the 
values needed in determining errors. 

TEAM PROJECT. Interpolation and Extrapolation. 
(a) Lagrange practical error estimate (after Theo- 
rem 1). Apply this to p,(9.2) and po(9.2) for the data 
Xo 9.0, x4 9.5, x9 11.0, fo = In xo. fi oa In x4, 
fo = In xg (6S-values). 

(b) Extrapolation. Given (x;, f(x;)) = (0.2, 0.9980), 
(0.4, 0.9686), (0.6, 0.8443), (0.8, 0.5358), (1.0, 0). Find 
f(0.7) from the quadratic interpolation polynomials 
based on (a) 0.6, 0.8, 1.0, (8) 0.4, 0.6, 0.8, (y) 0.2, 0.4, 
0.6. Compare the errors and comment. [Exact f(x) = 
cos (5 7x”), f(0.7) = 0.7181 (4S).] 

(c) Graph the product of factors (x — xj) in the error 
formula (5) for n = 2,---,10 separately. What do 
these graphs show regarding accuracy of interpolation 
and extrapolation? 

WRITING PROJECT. Comparison of interpolation 
methods. List 4—5 ideas that you feel are most important 
in this section. Arrange them in best logical order. 
Discuss them in a 2-3 page report. 


19.4 Spline Interpolation 


Given data (function values, points in the xy-plane) (Xo, fo), (%1, f1),°**, On» fn) can be 
interpolated by a polynomial P,(x) of degree n or less so that the curve of P,,(x) passes 
through these n + 1 points (x;, fj); here fo = f(xo), ++ Sn = f(%n), See Sec. 19.3. 

Now if 7 is large, there may be trouble: P,,(x) may tend to oscillate for x between the nodes 
X0,°**, Xn. Hence we must be prepared for numeric instability (Sec. 19.1). Figure 434 shows 
a famous example by C. Runge? for which the maximum error even approaches °° as n > © 
(with the nodes kept equidistant and their number increased). Figure 435 illustrates the increase 
of the oscillation with n for some other function that is piecewise linear. 

Those undesirable oscillations are avoided by the method of splines initiated by I. J. 
Schoenberg in 1946 (Quarterly of Applied Mathematics 4, pp. 45-99, 112-141). This 
method is widely used in practice. It also laid the foundation for much of modern CAD 
(computer-aided design). Its name is borrowed from a draftman’s spline, which is an 
elastic rod bent to pass through given points and held in place by weights. The mathematical 
idea of the method is as follows: 


3CARL RUNGE (1856-1927), German mathematician, also known for his work on ODEs (Sec. 21.1). 
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Fig. 435. Piecewise linear function f(x) and interpolation polynomials of increasing degrees 


Instead of using a single high-degree polynomial P,, over the entire intervala = x Sb 
in which the nodes lie, that is, 


(1) A=X9<xXy< + SX, =|, 
we use 7 low-degree, e.g., cubic, polynomials 


qo(x), qua), a) dn—-1(%), 


one over each subinterval between adjacent nodes, hence go from x9 to x1, then g; from 
X 1 to xg, and so on. From this we compose an interpolation function g(x), called a spline, 
by fitting these polynomials together into a single continuous curve passing through the 
data points, that is, 


(2) g(xo) =f@o) =f sav =fevd=afA, +, 8@n) = fOr) = fr 


Note that g(x) = go(x) when xp S x S x4, then g(x) = qy(x) when xy =x S Xog, and so 
on, according to our construction of g. 

Thus spline interpolation is piecewise polynomial interpolation. 

The simplest q;’s would be linear polynomials. However, the curve of a piecewise linear 
continuous function has corners and would be of little interest in general—think of 
designing the body of a car or a ship. 

We shall consider cubic splines because these are the most important ones in applications. 
By definition, a cubic spline g(x) interpolating given data (xo, fo), ***, An, fy) is acontinuous 
function on the interval a = x9 Sx Sx, = D that has continuous first and second 
derivatives and satisfies the interpolation condition (2); furthermore, between adjacent nodes, 
g(x) is given by a polynomial q;(x) of degree 3 or less. 

We claim that there is such a cubic spline. And if in addition to (2) we also require that 


(3) g'(xo) = ko, g'(xn) = kn 
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(given tangent directions of g(x) at the two endpoints of the interval a S x S b), then we 
have a uniquely determined cubic spline. This is the content of the following existence 
and uniqueness theorem, whose proof will also suggest the actual determination of splines. 
(Condition (3) will be discussed after the proof.) 


Existence and Uniqueness of Cubic Splines 


Let (xo, fo), %1,f1)s°**, (Xn fn) with given (arbitrarily spaced) x; [see (1)] and 
given f; = f(x;),j = 9, 1,-++,n. Let kg and k,, be any given numbers. Then there 
is one and only one cubic spline g(x) corresponding to (1) and satisfying (2) 
and (3). 


By definition, on every subinterval J; given by x; = x S xj+1, the spline g(x) must agree 
with a polynomial g;(x) of degree not exceeding 3 such that 


(4) qjlxj) = fj), Gajev = frj+v (j = 0,1,:--,n — 1). 
For the derivatives we write 
(5) qj(xj) = kj, Gj(xj+1) = kj+a (j= 0,1,°-:,n — 1) 


with ko and k,, given and ky,---,k,—1 to be determined later. Equations (4) and (5) are 
four conditions for each q;(x). By direct calculation, using the notation 


1 1 6s 
(6*) cae ee er G=0,1,-::,2-1) 
we can verify that the unique cubic polynomial q;(x) (j = 0, 1,---,m — 1) satisfying (4) 
and (5) is 


gilx) = fx cF(x — x541)7L1 + 2e(x — x))] 


+i Gee = ayy Ll 26 = age) 


(6) 


+ kjcFx — xi) — xja1)” 


+ kgs ice(x = xx — Xj41)- 
Differentiating twice, we obtain 
7) Gj ej) = OCF f(x) + OCF (xj41) — 4eykj — 2epkj+1 
(8) Gi (jai) = OCF (x) — OcFfxj+1) + Bejky + 4ejkja1- 
By definition, g(x) has continuous second derivatives. This gives the conditions 


Gj—-1j) = gj (x4) G=1,-:+,n- 0. 
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If we use (8) with j replaced by j — 1, and (7), these n — 1 equations become 
(9) Cree al AP 2(G=1 ar ck; ste Ginn = 3[c7_1Vfj ap FV fir 


where Vf; = (xj) — f(vj—1) and Vfj41 = f(vja1) — f(x) andj = 1,---,n — 1, as before. 
This linear system of n — | equations has a unique solution ky, +--+, k,,—1 since the coefficient 
matrix is strictly diagonally dominant (that is, in each row the (positive) diagonal entry is 
greater than the sum of the other (positive) entries). Hence the determinant of the matrix 
cannot be zero (as follows from Theorem 3 in Sec. 20.7), so that we may determine unique 
values ky,---, k,,— of the first derivative of g(x) at the nodes. This proves the theorem. 


Storage and Time Demands in solving (9) are modest, since the matrix of (9) is sparse 
(has few nonzero entries) and tridiagonal (may have nonzero entries only on the diagonal 
and on the two adjacent “parallels” above and below it). Pivoting (Sec. 7.3) is not necessary 
because of that dominance. This makes splines efficient in solving large problems with 
thousands of nodes or more. For some literature and some critical comments, see American 
Mathematical Monthly 105 (1998), 929-941. 


Condition (3) includes the clamped conditions 
(10) 8(x0) =f%o), —-8'n) = f'n), 


in which the tangent directions f'(xg) and f‘(x,) at the ends are given. Other conditions 
of practical interest are the free or natural conditions 


(11) g(x) = 0, — g"(Xn) = 0 


(geometrically: zero curvature at the ends, as for the draftman’s spline), giving a natural 
spline. These names are motivated by Fig. 293 in Problem Set 12.3. 


Determination of Splines. Let kg and k,, be given. Obtain k1,---,k,—1 by solving the 
linear system (9). Recall that the spline g(x) to be found consists of n cubic polynomials 


qo>'**>4n—1- We write these polynomials in the form 
(12) CHES) = Gy ar GIGS = 889) ae Ces = ye te aj3% > By 
where j = 0,---,2—1. Using Taylor’s formula, we obtain 
ajo = Xj) = fj by (2), 
aj, = G(xj) = ky by (5), 
_ lp Se, 1 
(13) aja 5 Gi (Ce) = 2 Cian ip) = i (kj44 + 2kj) by (7), 
j J 


1 z 1 
aig = — G5 i) = 53 (Gi fir) + pe (kina +) 
4] a] 
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with aj3 obtained by calculating Gj(xj+1) from (12) and equating the result to (8), 
that is, 


6 2 
Gi(%j+1) = 2ajn + Gajghy = Fe fi ~ fir) + 5 (kj + kya), 
j j 
and now subtracting from this 2a; as given in (13) and simplifying. 


Note that for equidistant nodes of distance hj = h we can write c; = c = 1/h in (6*) 
and have from (9) simply 


3 : 
(14) Kees oe Gee oe (ei = h Gist > Jai), (= 1,°:-,n = 1). 


Spline Interpolation. Equidistant Nodes 


Interpolate f(x) = x* on the interval —1 S x < 1 by the cubic spline g(x) corresponding to the nodes x9 = —1, 
x1 = 0,x2 = 1 and satisfying the clamped conditions g'(—1) = f’(—1), ¢’(1) = f(D. 


Solution. 1 our standard notation the given data are fy = f(—1) = 1,f, = f(0) = 0,f2 =f) = 1. 
We have fh = | and n = 2, so that our spline consists of n = 2 polynomials 


1 1 


do(x) = doo + dove + 1) + doa(x 4 1)? + apg(x + 1)? (-12x = 0), 


qu(x) = ayo + ayyx 4 dyox” t ay3x? (= x= I). 


We determine the k; from (14) (equidistance!) and then the coefficients of the spline from (13). Since n = 2, 
the system (14) is a single equation (with 7 = 1 and h = 1) 


ko + 4ky + ko = 3( fo — fo)- 


Here fo = fo = | (the value of x* at the ends) and ko = —4, ka = 4, the values of the derivative Ax® at the 
ends —1 and 1. Hence 


44+4k,+4=30-1)=0, k,=0. 


From (13) we can now obtain the coefficients of go, namely, d99 = fo = 1, 401 = ko 4, and 


3 1 
a2 2 ‘fi fo) pee 30 —-1)-(- 8) =5 


2 1 
03 B (fo — fa) + 2 ft + ko) = 201 — 0) + (0 — 4) 2. 


Similarly, for the coefficients of g, we obtain from (13) the values ajp = fy = 0, ayy = ky = O, and 


a2 = 3( fe — fi) — (ko + 2k1) = 30. — 0) — (4 + 0) 1 
a3 = 2(fi — fo) + (ko + ky) = 200 — 1) + (4 + 0) = 2. 


This gives the polynomials of which the spline g(x) consists, namely, 


g(x) = 1-40 + 1) +54 1? - 204 1 x27 -2x3 if -1Sx0 
wo ={ 


qu(x) = —x? + 2x if 0Sx72 1, 


Figure 436 shows f(x) and this spline. Do you see that we could have saved over half of our work by using 
symmetry? a 


SEC. 19.4 Spline Interpolation 825 


-l a 1 x 


g(x) 


Fig. 436. Function f(x) = x* and cubic spline g(x) in Example 1 


EXAMPLE 2. Natural Spline. Arbitrarily Spaced Nodes 


Find a spline approximation and a polynomial approximation for the curve of the cross section of the circular- 
shaped Shrine of the Book in Jerusalem shown in Fig. 437. 


Fig. 437. Shrine of the Book in Jerusalem (Architects F. Kissler and A. M. Bartus) 


Solution. Thirteen points, about equally distributed along the contour (not along the x-axis!), give these data: 


A 5.8 5.0 4.0 2.5, 1.5 08 0 08 15 25 40 50 5.8 
fi 0 1.5 1.8 2:2 2.7 35 39 35 27 22 18 15 0 


The figure shows the corresponding interpolation polynomial of 12th degree, which is useless because of its 
oscillation. (Because of roundoff your software will also give you small error terms involving odd powers of x.) 
The polynomial is 


Pio(x) = 3.9000 — 0.65083x2 + 0.033858x* + 0.011041x® — 0.0014010x8 
+ 0.000055595x?° — 0.00000071867x7?. 


The spline follows practically the contour of the roof, with a small error near the nodes —0.8 and 0.8. The spline 
is symmetric. Its six polynomials corresponding to positive x have the following coefficients of their 
representations (12). (Note well that (12) is in terms of powers of x — xj, not x!) 


J x-interval ayo Gy dye, Gj3 

0 0.0...0.8 3.9 0.00 —0.61 —0.015 
1 0.8...1.5 3.5 —1.01 —0.65 0.66 
2 1.5.:.2:5 2.7 —0.95 0.73 —0.27 
3 2.5...4.0 2.2 —0.32 —0.091 0.084 
4 4.0...5.0 1.8 —0.027 0.29 —0.56 
a 5.0...5.8 1.5 —1.13 —1.39 0.58 
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PROBLEM SET 19-4 


1. WRITING PROJECT. Splines. In your own words, 12. fo=fO)=1, A =fQ=9 fo=f(4 = 41, 


and using as few formulas as possible, write a short fg =f(6) = 41, ko =0, kg = —-12 
report on spline interpolation, its motivation, a 13. fo =fO=1. fA=f)=0, f=f) 1, 
comparison with polynomial interpolation, and its fs =fB)=0, ko =0, kg 6 
applications. 14. fo =f) =2, fi =f) =3, fo =f) =8, 
fs =f@3) = 12, ko =kg =0 
2-9| VERIFICATIONS. DERIVATIONS. 15. fo =f) =4, fi =f2=0, fp=f(4) =4, 
COMPARISONS tz = f(6) = 80, ko = kg = 0 
2. Individual polynomial g;. Show that g(x) in (6) 16 fo= f(0) = 2, Ar =fQ) 2, fo = f(A) = 2, 
satisfies the interpolation condition (4) as well as the fs = f(6) = 78, ko = ks = 0. Can you obtain the 
derivative condition (5). answer from that of Prob. 15? 
3. Verify the differentiations that give (7) and (8) from (6). 17. If a cubic spline is three times continuously differen- 


tiable (that is, it has continuous first, second, and third 
derivatives), show that it must be a single polynomial. 


18. CAS EXPERIMENT. Spline versus Polynomial. If 
your CAS gives natural splines, find the natural splines 
when x is integer from —m to m, and y(0) = 1 and all 
other y equal to 0. Graph each such spline along with 


= 


. System for derivatives. Derive the basic linear system 
(9) for ky,°+-, k,—1 as indicated in the text. 

5. Equidistant nodes. Derive (14) from (9). 

6. Coefficients. Give the details of the derivation of aj 

and aj3 in (13). 


7. Verify the computations in Example 1. the interpolation polynomial p2,,. Do this form = 2 to 
8. Comparison. Compare the spline g in Example | with 10 (or more). What happens with increasing m? 
the quadratic interpolation polynomial over the whole 19. Natural conditions. Explain the remark after (11). 


interval. Find the maximum deviations of g and po 


20. TEAM PROJECT. Hermite Interpolation and Bezier 
from f. Comment. 


Curves. In Hermite interpolation we are looking for 
a polynomial p(x) (of degree 2n + 1 or less) such that 
p(x) and its derivative p'(x) have given values at n + 1 


\e 


. Natural spline condition. Using the given coefficients, 
verify that the spline in Example 2 satisfies g(x) = 0 


at the ends. nodes. (More generally, p(x), p(x), p"(x),:++may be 
required to have given values at the nodes.) 

10-16 | DETERMINATION OF SPLINES (a) Curves with given endpoints and tangents. Let 
Find the cubic spline g(x) for the given data with kg and C be a curve in the xy-plane parametrically represented 
kp as given. by r(@) = [x(), yO], 0 St = 1 (see Sec. 9.5). Show 
10. f(—2) = f(-D) =f) =f2)=0, fO =1 that for given initial and terminal points of a curve and 

ko = ka = 0 : ; given initial and terminal tangents, say, 
11. If we started from the piecewise linear function in A: ro = [x(0), y(0)] 
Fig. 438, we would obtain g(x) in Prob. 10 as the spline u 
satisfying g(—2) = f(—2) =0, ¢(2)=f'2) =0. = [xo, Yol. 
Find and sketch or graph the corresponding interpolation B: ry = [x(), yQ)] 
polynomial of 4th degree and compare it with the spline. _ 
C = [¥1,. yi] 
omment. 
Vo = [x'(0), yO) 
= [xo. ol. 
v= (yO) 
= ben yi] 


we can find a curve C, namely, 
r(t) =Yo + Vot 


(15) (3(r, — To) — (2vo + va))t? 


Fig. 438. Spline and interpolation polynomial in 
Probs. 10 and 11 (2(r9 — ry) + vo + vy)t?: 
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in components, 


x(f) = x9 + xot + B31 — x9) — (2x9 + x4))t? 


+ (Axo — x1) tx6 + xt? 


y(t) = yo + yot + B(x — yo) — yo + yi)? 


+ (2(yo — ya) + yo + ye. 


Note that this is a cubic Hermite interpolation poly- 
nomial, and n = | because we have two nodes (the 
endpoints of C). (This has nothing to do with the 
Hermite polynomials in Sec. 5.8.) The two points 


Ga: Zo = To + Vo 
= [x9 + x9, Yo + yol 
and 
Gp: $1 =T1- V1 
= [x1 — x11 — yi] 


are called guidepoints because the segments AG and 
BGzg specify the tangents graphically. A, B, Ga, Gg 
determine C, and C can be changed quickly by moving 
the points. A curve consisting of such Hermite 
interpolation polynomials is called a Bezier curve, 
after the French engineer P. Bezier of the Renault 
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Automobile Company, who introduced them in the 
early 1960s in designing car bodies. Bezier curves (and 
surfaces) are used in computer-aided design (CAD) and 
computer-aided manufacturing (CAM). (For more 
details, see Ref. [E21] in App. 1.) 


(b) Find and graph the Bezier curve and _ its 
guidepoints if A:[0,0], B:[1,0], vo = [5.3], 
Vo [=33 V3]. 

(c) Changing guidepoints changes C. Moving guide- 
points farther away results in C “staying near the 
tangents for a longer time.” Confirm this by changing 
Vo and vy in (b) to 2Vo and 2v, (see Fig. 439). 

(d) Make experiments of your own. What happens if 
you change vy, in (b) to —vj. If you rotate the tangents? 
If you multiply vo and v, by positive factors less than 1? 


Fig. 439. Team Project 20(b) and (c): Bezier curves 


19.5 Numeric Integration and Differentiation 


In applications, the engineer often encounters integrals that are very difficult or even 
impossible to solve analytically. For example, the error function, the Fresnel integrals 
(see Probs. 16-25 on nonelementary integrals in this section), and others cannot 
be evaluated by the usual methods of calculus (see App. 3, (24)-(44) for such 
“difficult” integrals). We then need methods from numerical analysis to evaluate such 
integrals. We also need numerics when the integrand of the integral to be evaluated 
consists of an empirical function, where we are given some recorded values of that 
function. Methods that address these kinds of problems are called methods of numeric 


integration. 


Numeric integration means the numeric evaluation of integrals 


b 
J= | F(x) dx 


a 


where a and b are given and fis a function given analytically by a formula or empirically 
by a table of values. Geometrically, J is the area under the curve of f between a and b 
(Fig. 440), taken with a minus sign where f is negative. 
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We know that if fis such that we can find a differentiable function F whose derivative 
is f, then we can evaluate J directly, i.e., without resorting to numeric integration, by 
applying the familiar formula 


b 
i | f(x) dx = F(b) — F(a) [F'@) =f@)1. 


a 


Your CAS (Mathematica, Maple, etc.) or tables of integrals may be helpful for this purpose. 


Rectangular Rule. Trapezoidal Rule 


Numeric integration methods are obtained by approximating the integrand f by functions 
that can easily be integrated. 

The simplest formula, the rectangular rule, is obtained if we subdivide the interval of 
integration a =x Sb into n subintervals of equal length h = (b — a)/n and in each 
subinterval approximate f by the constant / (x7), the value of f at the midpoint xe of the jth 
subinterval (Fig. 441). Then fis approximated by a step function (piecewise constant function), 
the n rectangles in Fig. 441 have the areas f(x7)h,--:, f(x;)h, and the rectangular rule is 


b 
b- 
(1) J= | F(x) dx ~ h[ fet) + fa) + + + fH] ( = ‘). 


n 
a 


The trapezoidal rule is generally more accurate. We obtain it if we take the same 
subdivision as before and approximate f by a broken line of segments (chords) with 
endpoints [a, f(a)], [x1, f™y],---, [b, f(b)] on the curve of f (Fig. 442). Then the area 
under the curve of f between a and b is approximated by n trapezoids of areas 


aL fla) +favlh, alfad+foolh, +,  alf@n-1) +f. 


y=flx) 


a b x 


Fig. 440. Geometric interpretation 
of a definite integral Fig. 441. Rectangular rule 


y 


b x 


2 eee X41 


Fig. 442. Trapezoidal rule 
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EXAMPLE 1 


By taking their sum we obtain the trapezoidal rule 
f 1 1 

(2) J= | F(x) dx ~ h fe) ae a i) 
a 


where h = (b — a)/n, as in (1). The x;’s and a and b are called nodes. 


Trapezoidal Rule 


1 
Evaluate J = | en dx by means of (2) with n = 10. 
0 
Note that this integral cannot be evaluated by elementary calculus, but leads to the error function (see Eq. (35), 


App. 3). 
Solution. J ~ 0.1(0.5 + 1.367879 + 6.778167) = 0.746211 from Table 19.3. B 


Table 19.3 Computations in Example 1 


ij x; xy eu 
0 0 0 1.000000 
1 0.1 0.01 0.990050 
2 0.2 0.04 0.960789 
3 0.3 0.09 0.913931 
4 0.4 0.16 0.852144 
5 0.5 0.25 0.778801 
6 0.6 0.36 0.697676 
7 0.7 0.49 0.612626 
8 0.8 0.64 0.527292 
9 0.9 0.81 0.444858 
10 1.0 1.00 0.367879 
Sums 1.367879 6.778167 


Error Bounds and Estimate for the Trapezoidal Rule 


An error estimate for the trapezoidal rule can be derived from (5) in Sec. 19.3 withn = | 
by integration as follows. For a single subinterval we have 


f' 
2 


F(x) — pie) = (& — xox — x4) 


with a suitable t depending on x, between xg and xj. Integration over x from a = xq to 


X1 =Xo + h gives 


XLoth 


XLoth 
h 
| f(x) dx — [yo + een) = | (x — xo)(x — xo 


Xo Xo 


nb ee dx. 
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Setting x — xq = v and applying the mean value theorem of integral calculus, which we can 
use because (x — x9)(x — x9 — /) does not change sign, we find that the right side equals 


f"@) ( 2 f'@ _ r 
2 3 2 > - i 


h 
(2") | v(v — h) du f'@) 


0 


where 7 is a (suitable, unknown) value between xg and x,. This is the error for the 
trapezoidal rule with n = 1, often called the local error. 

Hence the error € of (2) with any n is the sum of such contributions from the n 
subintervals; since h = (b — a)/n, nh® = n(b — a)?/n?, and (b — a)? = n7h", we obtain 


LUO eS Poe 
(3) ee get a ee 


with (suitable, unknown) 7 between a and b. 
Because of (3) the trapezoidal rule (2) is also written 


b 
1 1 b= WD 
(2*) J= | f(x) dx ~ Hf 5 fe) + fy to + fGn-v + ZF) | — an Wf"). 


a 


Error Bounds are now obtained by taking the largest value for f”, say, Mz, and the 
smallest value, M5, in the interval of integration. Then (3) gives (note that K is negative) 


_@=aF _ _b=a 


h?. 
12n? 12 


(4) KM, S¢ = KMs5 where ik = 


Error Estimation by Halving h is advisable if f” is very complicated or unknown, for 
instance, in the case of experimental data. Then we may apply the Error Principle of 
Sec. 19.1. That is, we calculate by (2), first with h, obtaining, say, J = J, + €p, and then 
with xh, obtaining J = Jn + €nj2- Now if we replace h? in (3) with (5h), the error is 
multiplied by i. Hence €n2 ~ g€n (not exactly because ¢ may differ). Together, 
Jnj2 ae En/2 = Jn ae En ~ Jn a 4€n/2- Thus Jnj2 =, Jn = (4 = l€n2. Division by 3 
gives the error formula for Jj,2 


(5) En/2 ~ 3 VJnyj2 — Sn)- 


Error Estimation for the Trapezoidal Rule by (4) and (5) 
Estimate the error of the approximate value in Example | by (4) and (5). 
Solution. (A) Error bounds by (4). By differentiation, f"(x) = 2(2x7 — em, Also, f’"(x) > 0if0 <x <1, 
so that the minimum and maximum occur at the ends of the interval. We compute My = f”(1) = 0.735759 and 
M3 = f'(O) = —2. Furthermore, K = —1 /1200, and (4) gives 

—0.000614 S e€ = 0.001667. 
Hence the exact value of J must lie between 


0.746211 — 0.000614 = 0.745597 and 0.746211 + 0.001667 = 0.747878. 


Actually, J = 0.746824, exact to 6D. 


SEC. 19.5 Numeric Integration and Differentiation 831 


(B) Error estimate by (5). Jy, = 0.746211 in Example 1. Also, 


19 
Sie 1 
Se G/20¥ + 5 (1 + 0.367879) | = 0.746671, 


Jnj2 = 0.05| 
j=l 


Hence €p/2 = 302 — Jn) = 0.000153 and Jpg + €nj2 = 0.746824, exact to 6D. |_| 


Simpson's Rule of Integration 


Piecewise constant approximation of f led to the rectangular rule (1), piecewise linear 
approximation to the trapezoidal rule (2), and piecewise quadratic approximation will lead 
to Simpson’s rule, which is of great practical importance because it is sufficiently accurate 
for most problems, but still sufficiently simple. 

To derive Simpson’s rule, we divide the interval of integration a = x S 5 into an even 
number of equal subintervals, say, into m = 2m subintervals of length h = (b — a)/(2m), 
with endpoints x9 (= a), X41,°°+,Xam—1,X2m (= b); see Fig. 443. We now take the first 
two subintervals and approximate f(x) in the interval x9 S x S x9 = Xo + 2h by the 
Lagrange polynomial po(x) through (x9, fo), 1.1), 2, fa), where fj = f(x;). From (3) 
in Sec. 19.3 we obtain 


(x — x1)(x — Xa) (x — xo)(x — X2) (x — xo)(® — x1) 
(x9 — xa(Xo —X2) 79 (Ky — XOMX1— X92) 2? (a — XO)(X2 — x) 7” 


(6) pax) = 


The denominators in (6) are 2h, —h, and 2h", respectively. Setting s = (x — x4)/h, we 
have 


xX — xX, = sh, X-—Xgp9 =xX-(4,-A=(S+ Dh 
X—xXg=x-(x4,+h=(s—- Dh 


and we obtain 
pa(x) = $5(s — Dfo — (8 + Is — Df + as + Dope. 


We now integrate with respect to x from x9 to xg. This corresponds to integrating with 
respect to s from —1 to 1. Since dx = h ds, the result is 


= = 1 4 1 
7") | f(x) dx = | P2(x) dx = a( fo + 3 fit 3 fh). 
y First parabola 


Oe a Second parabola 


y = flx) Last parabola 


ss Sig ea a | Xom-2 %2m-1 8 


Fig. 443. Simpson’s rule 
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A similar formula holds for the next two subintervals from x2 to x4, and so on. By summing 
all these m formulas we obtain Simpson’s rule* 


b 
h 
(7) | G9) Ge = a (fo + 4fi + 2f2 + 4f3 +--+ + 2fam—2 + 4fem-1 + fam); 


a 


where h = (b — a)/(2m) and fj = f(x;). Table 19.4 shows an algorithm for Simpson’s 
rule. 


Table 19.4 Simpson’s Rule of Integration 


ALGORITHM SIMPSON (a, b, m, fos fi, °° *» fom) 


This algorithm computes the integral J = i. f(x) dx from given values f; = f(x;) at 
equidistant x9 = a, x1 = Xo + hy-+ ++, Xam, = X9 + 2mh = b by Simpson’s rule (7), 
where h = (b — a)/(2m). 


INPUT: a,b, m, fo,° ++, fom 
OUTPUT: Approximate value J of J 


Compute so = fo + fam 


Sy = fit fs to-+ > fom-1 
Sg = fot fat-++ + fom-2 
h = (b—a)/2m 


7=5 (sg + 454 + 259) 


OUTPUT J. Stop. 
End SIMPSON 


Error of Simpson’s Rule (7). If the fourth derivative f exists and is continuous on 
a =x Sb, the error of (7), call it €;, is 


(b — a) i 
8 Oy = — nef (2): 
(8) Es noe ia ot © 


here 7 is a suitable unknown value between a and b. This is obtained similarly to (3). 
With this we may also write Simpson’s rule (7) as 


b 
h b— - 
a) | foode= 3 Ua + at + fd — Pye HEM 


“THOMAS SIMPSON (1710-1761), self-taught English mathematician, author of several popular textbooks. 


Simpson’s rule was used much earlier by Torricelli, Gregory (in 1668), and Newton (in 1676). 
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EXAMPLE 3 


Error Bounds. By taking for f in (8) the maximum M4 and minimum M§ on the interval 
of integration we obtain from (8) the error bounds (note that C is negative) 


ViGESOPR see 
180(2m)* 180 


(9) CM, = €s =CM4 where C= ht. 


Degree of Precision (DP) of an integration formula. This is the maximum degree of 
arbitrary polynomials for which the formula gives exact values of integrals over any 
intervals. 

Hence for the trapezoidal rule, 


DP = 1 


because we approximate the curve of f by portions of straight lines (linear polynomials). 
For Simpson’s rule we might expect DP = 2 (why?). Actually, 


DP = 3 


by (9) because f rig identically zero for a cubic polynomial. This makes Simpson’s rule 
sufficiently accurate for most practical problems and accounts for its popularity. 


Numeric Stability with respect to rounding is another important property of Simpson’s 
rule. Indeed, for the sum of the roundoff errors €; of the 2m + 1| values f; in (7) we obtain, 
since h = (b — a)/2m, 


h b 
3 leo + er +--+ + €aml S FS 6mu = (b — au 
LIN 


where u is the rounding unit (u = 3 - 107® if we round off to 6D; see Sec. 19.1). Also 
6 = 1+ 4 + 1 is the sum of the coefficients for a pair of intervals in (7); take m = | in 
(7) to see this. The bound (b — a)u is independent of m, so that it cannot increase with 
increasing m, that is, with decreasing h. This proves stability. oH 


Newton-Cotes Formulas. We mention that the trapezoidal and Simpson rules are special 
closed Newton—Cotes formulas, that is, integration formulas in which f(x) is interpolated 
at equally spaced nodes by a polynomial of degree n(n = | for trapezoidal, n = 2 for 
Simpson), and closed means that a and b are nodes (a = xo, b = xy). n = 3 and higher 
n are used occasionally. From n = 8 on, some of the coefficients become negative, so 
that a positive f; could make a negative contribution to an integral, which is absurd. For 
more on this topic see Ref. [E25] in App. 1. 


Simpson’s Rule. Error Estimate 
1 2 

Evaluate J = | e ~ dx by Simpson’s rule with 2m = 10 and estimate the error. 
0 


Solution. Since h = 0.1, Table 19.5 gives 


0.1 
j= cs (1.367879 + 4 - 3.740266 + 2 - 3.037901) = 0.746825. 
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Estimate of error. Differentiation gives fC) = 4(4x4 — 12x? + 3)e"™. By considering the derivative f? 


of FS we find that the largest value of f ® in the interval of integration occurs at 0 and the smallest value at 
x* = (2.5 — 0.5V10)". Computation gives the values M4 = f(0) = 12 and M4 = f(x") = —7.419. Since 
2m = 10 and b — a = 1, we obtain C = —1/1800000 = —0.00000056. Therefore, from (9), 


—0.000007 = e, = 0.000005. 


Hence J must lie between 0.746825 — 0.000007 = 0.746818 and 0.746825 + 0.000005 = 0.746830, so that at 
least four digits of our approximate value are exact. Actually, the value 0.746825 is exact to 5D because 
J = 0.746824 (exact to 6D). 

Thus our result is much better than that in Example | obtained by the trapezoidal rule, whereas the number 
of operations is nearly the same in both cases. a) 


Table 19.5 Computations in Example 3 


di Xj a eu 
0 0 0 1.000000 
1 0.1 0.01 0.990050 
2 0.2 0.04 0.960789 
3 0.3 0.09 0.913931 
4 0.4 0.16 0.852144 
5 0.5, 0.25 0.778801 
6 0.6 0.36 0.697676 
7 0.7 0.49 0.612626 
8 0.8 0.64 0.527292 
9 0.9 0.81 0.444858 

10 1.0 1.00 0.367879 

Sums 1.367879 3.740266 3.037901 


Instead of picking an n = 2m and then estimating the error by (9), as in Example 3, it is 
better to require an accuracy (e.g., 6D) and then determine n = 2m from (9). 


Determination of n = 2m in Simpson’s Rule from the Required Accuracy 
What n should we choose in Example 3 to get 6D-accuracy? 


Solution. Using Mg = 12 (which is bigger in absolute value than M4, we get from (9), with b — a = | and 
the required accuracy, 


12 1 
=—- 1076, 


|cM4| = ——— = 
180(2m) 2 


2+ 108-127" 
9.55. 


thus m | 
180 + 24 


Hence we should choose n = 2m = 20. Do the computation, which parallels that in Example 3. 
Note that the error bounds in (4) or (9) may sometimes be loose, so that in such a case a smaller n = 2m 
may already suffice. | 


Error Estimation for Simpson’s Rule by Halving h. The idea is the same as in (5) 
and gives 


(10) €n/2 ~ is Jny2 — In): 


Jy, is obtained by using / and Jy 2 by using xh, and €p,/2 is the error of Jy/2. 
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EXAMPLE 5 


EXAMPLE 6 


Derivation. In (5) we had 3 as the reciprocal of 3 = 4 — 1 and t = 6)" resulted from 
h® in (3) by replacing h with xh. In (10) we have is as the reciprocal of 15 = 16 — 1 
and ik = (5)* results from h* in (8) by replacing A with xh. 


Error Estimation for Simpson’s Rule by Halving 


4 


Integrate f(x) = 4 Xx” COS 4 ax from 0 to 2 with h = | and apply (10). 


Solution. The exact 5D-value of the integral is J = 1.25953. Simpson’s rule gives 


[f() + 4f(1) + f(2)] = $0 + 4 - 0.555360 + 0) = 0.740480, 
[f(O) + 4fG) + 2f(D) + 4F@ + fOI 
= 4[0 + 4+ 0.045351 + 2 - 0.555361 + 4+ 1.521579 + O] = 1.22974. 


Jn 


1 
3 
1 
Inj2=6 


Hence (10) gives €nj2 = (1.22974 — 0.74048) = 0.032617 and thus J ~ Jpy2 + €n/g = 1.26236, with an 
error —0.00283 which is less in absolute value than ip of the error 0.02979 of Jnz, Hence the use of (10) was 
well worthwhile. | 


Adaptive Integration 


The idea is to adapt step / to the variability of f(x). That is, where f varies but little, we can 
proceed in large steps without causing a substantial error in the integral, but where f varies 
rapidly, we have to take small steps in order to stay everywhere close enough to the curve 
of f. 

Changing h is done systematically, usually by halving h, and automatically (not “by hand”’) 
depending on the size of the (estimated) error over a subinterval. The subinterval is halved 
if the corresponding error is still too large, that is, larger than a given tolerance TOL 
(maximum admissible absolute error), or is not halved if the error is less than or equal to 
TOL (or doubled if the error is very small). 

Adapting is one of the techniques typical of modern software. In connection with 
integration it can be applied to various methods. We explain it here for Simpson’s rule. In 
Table 19.6 an asterisk means that for that subinterval, TOL has been reached. 


Adaptive Integration with Simpson’s Rule 


Integrate f(x) = 4x4 cos 7x from x =0 to 2 by adaptive integration and with Simpson’s rule and 
TOL[O, 2] = 0.0002. 


Solution. Table 19.6 shows the calculations. Figure 444 shows the integrand f(x) and the adapted intervals 
used. The first two intervals ({0, 0.5], [0.5, 1.0]) have length 0.5, hence h = 0.25 [because we use 2m = 2 
subintervals in Simpson’s rule (7**)]. The next two intervals ([1.00, 1.25], [1.25, 1.50]) have length 0.25 
(hence A = 0.125) and the last four intervals have length 0.125. Sample computations. For 0.740480 see 
Example 5. Formula (10) gives (0.123716 — 0.122794)/15 = 0.000061. Note that 0.123716 refers to [0, 0.5] 
and [0.5, 1], so that we must subtract the value corresponding to [0, 1] in the line before. Etc. 
TOL[O, 2] = 0.0002 gives 0.0001 for subintervals of length 1, 0.00005 for length 0.5, etc. The value of the 
integral obtained is the sum of the values marked by an asterisk (for which the error estimate has become 
less than TOL). This gives 


J ~ 0.123716 + 0.528895 + 0.388263 + 0.218483 = 1.25936. 


The exact 5D-value is J = 1.25953. Hence the error is 0.00017. This is about 1/200 of the absolute value of 
that in Example 5. Our more extensive computation has produced a much better result. | 
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Table 19.6 Computations in Example 6 


Interval Integral Error (10) TOL Comment 

[0, 2] 0.740480 0.0002 
[0, 1] 0.122794 
[1, 2] 1,10695_ 

Sum = 1.22974 0.032617 0.0002 Divide further 
[0.0, 0.5] 0.004782 
[0.5, 1.0] 0.118934 

Sum = 0.123716* 0.000061 0.0001 TOL reached 
[1.0, 1.5] 0.528176 
[1.5, 2.0] 0.605821 

Sum = 1.13300 0.001803 0.0001 Divide further 
[1.00, 1.25] 0.200544 
[1.25, 1.50] 0.328351 

Sum = 0.528895* 0.000048 0.00005 TOL reached 
[1.50, 1.75] 0.388235 
[1.75, 2.00] 0.218457 

Sum = 0.606692 0.000058 0.00005 Divide further 
[1.500, 1.625] 0.196244 
[1.625, 1.750] 0.192019 

Sum = 0.388263* 0.000002 0.000025 TOL reached 
[1.750, 1.875] 0.153405 
[1.875, 2.000] 0.065078 

Sum = 0.218483* 0.000002 0.000025 TOL reached 


1 


e 
0) 0.5 
Fig. 444. Adaptive integration in Example 6 


Gauss Integration Formulas 
Maximum Degree of Precision 


Our integration formulas discussed so far use function values at predetermined 
(equidistant) x-values (nodes) and give exact results for polynomials not exceeding a 
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certain degree [called the degree of precision; see after (9)]. But we can get much more 
accurate integration formulas as follows. We set 


1 n 
(11) | f(@ dt ~ SAF Lf = f(t] 
fal 


il 


with fixed n, and t = +1 obtained from x = a, b by setting x = 5 [a(t —1)+bd¢+ IJ. 
Then we determine the n coefficients Aj,---,A, and n nodes fy,---, t,, so that (11) gives 
exact results for polynomials of degree k as high as possible. Since n + n = 2n is the 
number of coefficients of a polynomial of degree 2n — 1, it follows that k S 2n —1. 

Gauss has shown that exactness for polynomials of degree not exceeding 2 — 1 (instead 
of n — 1 for predetermined nodes) can be attained, and he has given the location of the 
t;(= the jth zero of the Legendre polynomial P, in Sec. 5.3) and the coefficients A; which 
depend on n but not on f(2), and are obtained by using Lagrange’s interpolation polynomial, 
as shown in Ref. [E5] listed in App. 1. With these t; and A;, formula (11) is called a Gauss 
integration formula or Gauss quadrature formula. Its degree of precision is 2n — 1, as 
just explained. Table 19.7 gives the values needed for n = 2,---,5. (For larger n, see 
pp. 916-919 of Ref. [GenRef1] in App. 1.) 


Table 19.7 Gauss Integration: Nodes t, and Coefficients A; 


n Nodes ¢; Coefficients A; Degree of Precision 
> —0.5773502692 1 3 
0.5773502692 1 
—0.7745966692 0.5555555556 
3 0 0.8888888889 5 
0.7745966692 0.5555555556 
—0.8611363116 0.347854845 | 
4 —0.33998 10436 0.6521451549 F 
0.33998 10436 0.652 1451549 
0.8611363116 0.347854845 | 
—0.9061798459 0.236926885 | 
—0.5384693101 0.4786286705 
=) 0 0.5688888889 9 
0.5384693101 0.4786286705 
0.9061798459 0.236926885 | 


EXAMPLE 7 _ Gauss Integration Formula with n = 3 


Evaluate the integral in Example 3 by the Gauss integration formula (11) with n = 3. 


Solution. We have to convert our integral from 0 to | into an integral from —1 to 1. We set x = AG + 1). 
Then dx = 5 dt, and (11) with n = 3 and the above values of the nodes and the coefficients yields 
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1 1 
2 1 I 2 
{ exp(-)ax=1 | exp (-e+n a 
0 2 Jy 4 


1 {5 1 3 : 8 ( ‘) 5 ( “( 2)’ 
~ 1 14 74681 
2 E exp ( Al V3) ge. ay ook A 5 nee 


(exact to 6D: 0.746825), which is almost as accurate as the Simpson result obtained in Example 3 with a much 
larger number of arithmetic operations. With 3 function values (as in this example) and Simpson’s rule we would 
get 3 (1 + 4e~°? + e~1) = 0.747180, with an error over 30 times that of the Gauss integration. fe] 


Gauss Integration Formula with n = 4 and 5 


4 


Integrate f(x) = darx cos 4 ax from x = 0 to 2 by Gauss. Compare with the adaptive integration in Example 6 


and comment. 


Solution. x =t + | gives f(t) = Aart + 1)* cos Ga (t + 1)), as needed in (11). For n = 4 we calculate (6S) 


J~Ayfi + +++ + Agfa = Ai(fi + fa) + Ao(fe + fa) 
= 0.347855(0.000290309 + 1.02570) + 0.652145(0.129464 + 1.25459) = 1.25950. 


The error is 0.00003 because J = 1.25953 (6S). Calculating with 10S and n = 4 gives the same result; so the 
error is due to the formula, not rounding. For n = 5 and 10S we get J ~ 1.259526185, too large by the amount 
0.000000250 because J = 1.259525935 (10S). The accuracy is impressive, particularly if we compare the amount 
of work with that in Example 6. ie] 


Gauss integration is of considerable practical importance. Whenever the integrand / is 
given by a formula (not just by a table of numbers) or when experimental measurements 
can be set at times ¢; (or whatever ¢ represents) shown in Table 19.7 or in Ref. [GenRef1], 
then the great accuracy of Gauss integration outweighs the disadvantage of the complicated 
t; and A; (which may have to be stored). Also, Gauss coefficients A; are positive for all 
n, in contrast with some of the Newton—Cotes coefficients for larger n. 

Of course, there are frequent applications with equally spaced nodes, so that Gauss 
integration does not apply (or has no great advantage if one first has to get the f; in (11) 
by interpolation). 

Since the endpoints —1 and | of the interval of integration in (11) are not zeros of Py, 
they do not occur among f,--::,f,, and the Gauss formula (11) is called, therefore, an 
open formula, in contrast with a closed formula, in which the endpoints of the interval 
of integration are fo and ¢,,. [For example, (2) and (7) are closed formulas. ] 


Numeric Differentiation 


Numeric differentiation is the computation of values of the derivative of a function f from 
given values of f. Numeric differentiation should be avoided whenever possible. Whereas 
integration is a smoothing process and is not very sensitive to small inaccuracies in function 
values, differentiation tends to make matters rough and generally gives values of f’ that are 
much less accurate than those of f. The difficulty with differentiation is tied in with the 
definition of the derivative, which is the limit of the difference quotient, and, in that quotient, 
you usually have the difference of a large quantity divided by a small quantity. This can 
cause numerical instability. While being aware of this caveat, we must still develop basic 
differentiation formulas for use in numeric solutions of differential equations. 

We use the notations f =f (xj), fi =f "(xj), etc., and may obtain rough approximation 
formulas for derivatives by remembering that 


f+ A) FO) 


, ‘i 
f@ = jim ‘a 
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This suggests 


6 — 
(12) fin =~ 2 AD 


Similarly, for the second derivative we obtain 


o —2fi+ 
(13) fx +28 - Ls 


etc. 


More accurate approximations are obtained by differentiating suitable Lagrange 
polynomials. Differentiating (6) and remembering that the denominators in (6) are 2h?, 
—h?, 2h, we have 


! ! 2x — 
ff) * po) = 


Evaluating this at x9, x1, Xx2, we obtain the “three-point formulas” 


@) fi ~ 5-3 + 4h — fi. 
1 
' 1 
(14) ) fl ~ 5; (fo + A) 


, 1 
(c) fo = Hh (fo — 4f1 + 3/2). 


Applying the same idea to the Lagrange polynomial p4(x), we obtain similar formulas, 
in particular, 


fo — 8fi + 8f3 — fa). 


, 
(15) Sa ~ Tp 


Some examples and further formulas are included in the problem set as well as in 
Ref. [E5] listed in App. 1. 


PROBEEM—SET 19-5 


1-6 


1. Rectangular rule. Evaluate the integral in Example 


RECTANGULAR AND TRAPEZOIDAL RULES 3. Trapezoidal rule. To get a feel for increase in accuracy, 


integrate x” from 0 to 1 by (2) withh = 1, 0.5, 0.25, 0.1. 


1 by the rectangular rule (1) with subintervals of 4. Error estimation by halfing. Integrate f(x) = x* from 
length 0.1. Compare with Example 1. (6S-exact: 0 to 1 by (2) with h = 1,h = 0.5, h = 0.25 and esti- 
0.746824) mate the error for h = 0.5 and h = 0.25 by (5). 


2. Bounds for (1). Derive a formula for lower and upper 5. Error estimation. Do the tasks in Prob. 4 for 
bounds for the rectangular rule. Apply it to Prob. 1. f@) = sin 377K. 
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6. Stability. Prove that the trapezoidal rule is stable with 
respect to rounding. 


SIMPSON’S RULE 
2 
dx 


Evaluate the integrals A = | pr B 
1 0 


ll 
>— 
i) 
» 
= 
iss) 
R 
= 


1 
d. 
I= | z 5 by Simpson’s rule with 2m as indicated, 
1 


Pix 
and pee with the exact value known from calculus. 
7. A,2m=4 8. A, 2m = 10 
9. B,2m =4 10. B, 2m = 10 
11. J, 2m = 4 12. J,2m = 10 


13. Error estimate. Compute the integral J by Simpson’s 
rule with 2m = 8 and use the value and that in Prob. 
11 to estimate the error by (10). 

14. Error bounds and estimate. Integrate e~” from 0 to 2 
by (7) with h = 1 and with h = 0.5. Give error bounds 
for the h = 0.5 value and an error estimate by (10). 

15. Given TOL. Find the smallest n in computing A (see 


Probs. 7 and 8) such that 5S-accuracy is guaranteed 
(a) by (4) in the use of (2), (b) by (9) in the use of (7). 


16-21) NONELEMENTARY INTEGRALS 


The following integrals cannot be evaluated by the usual 
methods of calculus. Evaluate them as indicated. Compare 
your value with that possibly given by your CAS. Si(x) is 
the sine integral. S(x) and C(x) are the Fresnel integrals. 
See App. A3.1. They occur in optics. 


” gin x 
Six) = | = dx*, 
b * 


x x 

S(x) = | sin (x*’) dx*, C(x) = | cos (x*’) dx* 
0 0 

16. Si(1) by (2), n = 5,n = 10, and apply (5). 

17. Si(1) by (7), 2m = 2,2m = 4 

18. Obtain a better value in Prob. 17. Hint. Use (10). 

19. Si(1) by (7), 2m = 10 

20. S(1.25) by (7), 2m = 10 

21. C(1.25) by (7), 2m = 10 


GAUSS INTEGRATION 


Integrate by (11) with n = 5: 
22. cos x from 0 to har 

23. xe~” from 0 to 1 

24. sin (x?) from 0 to 1.25 
25. exp (—x?) from 0 to 1 


26. TEAM PROJECT. Romberg Integration (W. Rom- 
berg, Norske Videnskab. Trondheim, Forh. 28, Nr. 7, 
1955). This method uses the trapezoidal rule and gains 
precision stepwise by halving / and adding an error 
estimate. Do this for the integral of f(x) = e~” from 
x = 0 to x = 2 with TOL = 107%, as follows. 

Step 1. Apply the trapezoidal rule (2) with h = 2 
(hence n = 1) to get an approximation Jj. Halve h 
and use (2) to get Jo, and an error estimate 


1 
€21 = = Vai — Ji). 
27 = 1 


If |€g,| S TOL, stop. The result is Jog = Joy + €9. 

Step 2. Show that €9; = —0.066596, hence 
|€5;| > TOL and go on. Use (2) with h/4 to get J3, 
and add to it the error estimate €3, = 4 (Jay — Jg3) to 
get the better J3g = J3, + €31. Calculate 


1 1 
€32 ~ 54 _ 1 (J32 — Jez) = 15 (J32. — Jog). 


2 
If |€39| S TOL, stop. The result is Jgg = Jg2 + €39. 
(Why does 2* = 16 come in?) Show that we obtain 
€32 = —0.000266, so that we can stop. Arrange your 
J- and e-values in a kind of “difference table.” 


Jy, 
ae 
Ip, <a “5 
Te SS 
J31 J32 33 


If |€39| were greater than TOL, you would have to 
go on and calculate in the next step Ja, from (2) with 


h= i. then 
Jaz = Jai + €41 with €a1 = 3(Ja1 — J31) 
Ja3 = Jag + €42 with €42 = 75 (Jaz — J32) 
Jaa = Jaz + €43 with €43 = 63 Jaz — J33) 


where 63 = 2° — 1. (How does this come in?) 

Apply the Romberg method to the integral 
of f(x) = 47x" cos 47x from x=0 to 2 with 
TOL = 107%. 


27-30| DIFFERENTIATION 

27. Consider f(x) = x* for Xo = 0,x1 = 0.2, x9 = 0.4, 
x3 = 0.6,x4 = 0.8. Calculate f, from (14a), (14b), 
(14c), (15). Determine the errors. Compare and 
comment. 


Chapter 19 Review Questions and Problems 


28. A “four-point formula” for the derivative is 


/ od 
~s (—2f, — 3f2 + Of3 — fa). 
1 


Apply it to f(x) = x* with x1,°--,x4 as in Prob. 27, 
determine the error, and compare it with that in the case 
of (15). 

29. The derivative f(x) can also be approximated in 
terms of first-order and higher order differences (see 
Sec. 19.3): 


1. What is a numeric method? How has the computer 
influenced numerics? 

2. What is an error? A relative error? An error bound? 

3. Why are roundoff errors important? State the rounding 
rules. 

4. What is an algorithm? Which of its properties are 
important in software implementation? 

5. What do you know about stability? 

6. Why is the selection of a good method at least as 
important on a large computer as it is on a small one? 

7. Can the Newton (—Raphson) method diverge? Is it fast? 
Same questions for the bisection method. 

8. What is fixed-point iteration? 

9. What is the advantage of Newton’s interpolation 
formulas over Lagrange’s? 

10. What is spline interpolation? Its advantage over 
polynomial interpolation? 

11. List and compare the integration methods we have 
discussed. 

12. How did we use an interpolation polynomial in deriving 
Simpson’s rule? 

13. What is adaptive integration? Why is it useful? 

14. In what sense is Gauss integration optimal? 

15. How did we obtain formulas for numeric differentiation? 

16. Write —46.9028104, 0.000317399, 54/7, —890/3 in 
floating-point form with 5S (5 significant digits, 
properly rounded). 

17. Compute (5.346 — 3.644)/(3.444 — 3.055) as given 
and then rounded stepwise to 3S, 2S, 1S. Comment. 
(“Stepwise” means rounding the rounded numbers, not 
the given ones.) 

18. Compute 0.38755/(5.6815 — 0.38419) as given and 
then rounded stepwise to 4S, 3S, 2S, 1S. Comment. 

19. Let 19.1 and 25.84 be correctly rounded. Find the 
shortest interval in which the sum s of the true 
(unrounded) numbers must lie. 
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, 1 1 
f (Xo) =i (af — 5A‘ 
1 1 
t 3 fo 2 A’fo t ), 


Compute f’(0.4) in Prob. 27 from this formula, using 
differences up to and including first order, second 
order, third order, fourth order. 


30. Derive the formula in Prob. 29 from (14) in Sec. 19.3. 


CHAPTER 19 REVIEW QUESTIONS AND PROBLEMS 


20. Do the same task as in Prob. 19 for the difference 
3.2 — 6.29. 


21. What is the relative error of nd in terms of that of a? 

22. Show that the relative error of @ is about twice that 
of a. 

23. Solve x? — 40x + 2 = 0 in two ways (cf. Sec. 19.1). 
Use 4S-arithmetic. 

24. Solve x? — 100x + 1 = 0. Use 5S-arithmetic. 


25. Compute the solution of x* = x + 0.1 near x = 0 by 
transforming the equation algebraically to the form 
x = g(x) and starting from x9 = 0. 

26. Solve cos x = x” by Newton’s method, starting from 
x = 05. 


27. Solve Prob. 25 by bisection (3S-accuracy). 


28. Compute sinh0.4 from sinh0O, sinh 0.5 = 0.521, 
sinh 1.0 = 1.175 by quadratic interpolation. 


29. Find the cubic spline for the data f(0) = 0, f(1) = 0, 
f(2) = 4, ko I,kz = 5. 

30. Find the cubic spline g and the interpolation polynomial 
p for the data (0, 0), (1, 1), (2, 6), (3, 10), with 
q'(0) = 0, q'(3) = 0 and graph p and q on common 
axes. 


31. Compute the integral of x? from 0 to 1 by the 
trapezoidal rule with n = 5. What error bounds are 
obtained from (4) in Sec. 19.5? What is the actual error 
of the result? 


32. Compute the integral of cos (x?) from 0 to 1 by 
Simpson’s rule with 2m = 4. 


33. Solve Prob. 32 by Gauss integration with n = 3 and 
n= 5. 

34. Compute f’(0.2) for f(x) = x? using (14b) in Sec. 19.5 
with (a) h = 0.2, (b) h = 0.1. Compare the accuracy. 


35. Compute f”(0.2) for f(x) = x? using (13) in Sec. 19.5 
with (a) h = 0.2, (b) h = 0.1. 
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SUMMARY-OF-CHAPTER-19 


Numerics in General 


In this chapter we discussed concepts that are relevant throughout numeric work as 
a whole and methods of a general nature, as opposed to methods for linear algebra 
(Chap. 20) or differential equations (Chap. 21). 

In scientific computations we use the floating-point representation of numbers 
(Sec. 19.1); fixed-point representation is less suitable in most cases. 

Numeric methods give approximate values a of quantities. The error € of a is 


(1) €=a-a (Sec. 19.1) 


where a is the exact value. The relative error of @ is €/a. Errors arise from rounding, 
inaccuracy of measured values, truncation (that is, replacement of integrals by sums, 
series by partial sums), and so on. 

An algorithm is called numerically stable if small changes in the initial data give 
only correspondingly small changes in the final results. Unstable algorithms are 
generally useless because errors may become so large that results will be very 
inaccurate. The numeric instability of algorithms must not be confused with the 
mathematical instability of problems (“ill-conditioned problems,” Sec. 19.2). 

Fixed-point iteration is a method for solving equations f(x) = 0 in which the 
equation is first transformed algebraically to x = g(x), an initial guess x9 for the 
solution is made, and then approximations x4, x2,---, are successively computed 
by iteration from (see Sec. 19.2) 


(2) Xn+1 = &(Xn) (n = 0, 1,-+-). 


Newton’s method for solving equations f(x) = 0 is an iteration 


fn) 
iy) 


(3) Xn4+1=Xn - (Sec. 19.2). 


Here x,,+1 is the x-intercept of the tangent of the curve y = f(x) at the point xy. 
This method is of second order (Theorem 2, Sec. 19.2). If we replace f’ in (3) by 
a difference quotient (geometrically: we replace the tangent by a secant), we obtain 
the secant method; see (10) in Sec. 19.2. For the bisection method (which converges 
slowly) and the method of false position, see Problem Set 19.2. 

Polynomial interpolation means the determination of a polynomial p,,(x) such 
that py(xj) = fj, where j = 0,---,n and (x0, fo),°**,(n, fn) are measured or 
observed values, values of a function, etc. p,,(x) is called an interpolation polynomial. 
For given data, p,,(x) of degree n (or less) is unique. However, it can be written in 
different forms, notably in Lagrange’s form (4), Sec. 19.3, or in Newton’s divided 
difference form (10), Sec. 19.3, which requires fewer operations. For regularly 
spaced x9, X¥1 = X9 + h,+++, Xn = Xo + nh the latter becomes Newton’s forward 
difference formula (formula (14) in Sec. 19.3): 


Summary of Chapter 19 
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rr—lee(r-nt+)., 
(4) FS) = Prlx) = fo + r Afo +--+ i Af 
where r = (x — Xo)/h and the forward differences are Af; = fj+1 — fj and 
ANG = AP far — APF (ES 2 Sans 


A similar formula is Newton’s backward difference interpolation formula (formula 
(18) in Sec. 19.3). 

Interpolation polynomials may become numerically unstable as n increases, and 
instead of interpolating and approximating by a single high-degree polynomial it is 
preferable to use a cubic spline g(x), that is, a twice continuously differentiable 
interpolation function [thus, g(x;) = fj], which in each subinterval x; S x S xj41 
consists of a cubic polynomial g;(x); see Sec. 19.4. 


Simpson’s rule of numeric integration is [see (7), Sec. 19.5] 


b 
h 
(5) | FQ) dx = 3 (fo + 4fi + fa + 4fg + °° + 2fam—2 + Afam—1 + fom) 


a 


with equally spaced nodes x; = x9 + jh,j = 1,--+,2m,h = (b — a)/(2m), and 
Sj = f(x;). It is simple but accurate enough for many applications. Its degree of 
precision is DP = 3 because the error (8), Sec. 19.5, involves h*. A more practical 
error estimate is (10), Sec. 19.5, 


En/2 = 15 Jaye — Jn), 


obtained by first computing with step h, then with step 4/2, and then taking i of 
the difference of the results. 

Simpson’s rule is the most important of the Newton—Cotes formulas, which are 
obtained by integrating Lagrange interpolation polynomials, linear ones for the 
trapezoidal rule (2), Sec. 19.5, quadratic for Simpson’s rule, cubic for the three- 
eights rule (see the Chap. 19 Review Problems), etc. 


Adaptive integration (Sec. 19.5, Example 6) is integration that adjusts 
(“adapts”’) the step (automatically) to the variability of f(x). 


Romberg integration (Team Project 26, Problem Set 19.5) starts from the 
trapezoidal rule (2), Sec. 19.5, with h,h/2,h/4, etc. and improves results by 
systematically adding error estimates. 


Gauss integration (11), Sec. 19.5, is important because of its great accuracy 
(DP = 2n — 1, compared to Newton—Cotes’s DP = n — 1 or n). This is achieved 
by an optimal choice of the nodes, which are not equally spaced; see Table 19.7, 
Sec. 19.5. 


Numeric differentiation is discussed at the end of Sec. 19.5. (Its main application 
(to differential equations) follows in Chap. 21.) 


CHAPTER 2 O 


Numeric Linear Algebra 


This chapter deals with two main topics. The first topic is how to solve linear systems of 
equations numerically. We start with Gauss elimination, which may be familiar to some 
readers, but this time in an algorithmic setting with partial pivoting. Variants of this method 
(Doolittle, Crout, Cholesky, Gauss—Jordan) are discussed in Sec. 20.2. All these methods 
are direct methods, that is, methods of numerics where we know in advance how many 
steps they will take until they arrive at a solution. However, small pivots and roundoff 
error magnification may produce nonsensical results, such as in the Gauss method. A shift 
occurs in Sec. 20.3, where we discuss numeric iteration methods or indirect methods to 
address our first topic. Here we cannot be totally sure how many steps will be needed to 
arrive at a good answer. Several factors—such as how far is the starting value from our 
initial solution, how is the problem structure influencing speed of convergence, how 
accurate would we like our result to be—determine the outcome of these methods. 
Moreover, our computation cycle may not converge. Gauss-Seidel iteration and Jacobi 
iteration are discussed in Sec. 20.3. Section 20.4 is at the heart of addressing the pitfalls 
of numeric linear algebra. It is concerned with problems that are ill-conditioned. We learn 
to estimate how “bad” such a problem is by calculating the condition number of its matrix. 

The second topic (Secs. 20.6—20.9) is how to solve eigenvalue problems numerically. 
Eigenvalue problems appear throughout engineering, physics, mathematics, economics, 
and many areas. For large or very large matrices, determining the eigenvalues is difficult 
as it involves finding the roots of the characteristic equations, which are high-degree 
polynomials. As such, there are different approaches to tackling this problem. Some 
methods, such as Gerschgorin’s method and Collatz’s method only provide a range in 
which eigenvalues lie and thus are known as inclusion methods. Others such as 
tridiagonalization and QR-factorization actually find all the eigenvalues. The area is quite 
ingeneous and should be fascinating to the reader. 


COMMENT. This chapter is independent of Chap. 19 and can be studied immediately 
after Chap. 7 or 8. 


Prerequisite: Secs. 7.1, 7.2, 8.1. 
Sections that may be omitted in a shorter course: 20.4, 20.5, 20.9. 
References and Answers to Problems: App. | Part E, App. 2. 
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The basic method for solving systems of linear equations by Gauss elimination and back 
substitution was explained in Sec. 7.3. If you covered Sec. 7.3, you may wonder why we 
cover Gauss elimination again. The reason is that here we cover Gauss elimination in the 
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setting of numerics and introduce new material such as pivoting, row scaling, and operation 
count. Furthermore, we give an algorithmic representation of Gauss elimination in Table 20.1 
that can be readily converted into software. We also show when Gauss elimination runs 
into difficulties with small pivots and what to do about it. The reader should pay close 
attention to the material as variants of Gauss elimination are covered in Sec. 20.2 and, 
furthermore, the general problem of solving linear systems is the focus of the first half of 
this chapter. 

A linear system of m equations in n unknowns x1,°:+,X, is a set of equations 
E,,:::, E, of the form 


Ey: ay4X, tise + AnxXn = by 

Eo: a941%4 cpy See ter danXn = bo 
(1) 

E,: Gnixt Ho e-* AF GyyXy = dy, 


where the coefficients a;, and the b; are given numbers. The system is called homogeneous 
if all the b; are zero; otherwise it is called nonhomogeneous. Using matrix multiplication 
(Sec. 7.2), we can write (1) as a single vector equation 


(2) Ax =b 


where the coefficient matrix A = [a;),] is the n X n matrix 


41 2 ** ayn Xy by 
da, dog *** dan : 

A= », and x=]: and b= 
Qn1i An2 is ann Xn bn 


are column vectors. The following matrix A is called the augmented matrix of the 


system (1): 
ayy ay, dy 
- a21 dg be 
A=[A b]= 
ani ann by 
A solution of (1) is a set of numbers x4,---,x, that satisfy all the n equations, and a 


solution vector of (1) is a vector x whose components constitute a solution of (1). 

The method of solving such a system by determinants (Cramer’s rule in Sec. 7.7) is 
not practical, even with efficient methods for evaluating the determinants. 

A practical method for the solution of a linear system is the so-called Gauss elimination, 
which we shall now discuss (proceeding independently of Sec. 7.3). 


846 


EXAMPLE 1 


CHAP. 20 Numeric Linear Algebra 


Gauss Elimination 


This standard method for solving linear systems (1) is a systematic process of elimination 
that reduces (1) to triangular form because the system can then be easily solved by back 
substitution. For instance, a triangular system is 


3x1 + 5x9e + 2x3 = 8 
8x2 ae 2x3 =—7 
6x3 = 3 


and back substitution gives x3 = 2 = 5 from the third equation, then 


x2 = 3(-7 — 2x3) = —1 


from the second equation, and finally from the first equation 


x1 = (8 — 5xq — 2x3) = 4. 


How do we reduce a given system (1) to triangular form? In the first step we eliminate 
x, from equations Eg to E,, in (1). We do this by adding (or subtracting) suitable multi- 
ples of E; to (from) equations Eg,---,E, and taking the resulting equations, call them 
ES,---, E* as the new equations. The first equation, Ej, is called the pivot equation in 
this step, and ay is called the pivot. This equation is left unaltered. In the second step 
we take the new second equation E3 (which no longer contains x,) as the pivot equation 
and use it to eliminate xz from E3 to Ey. And so on. After n — 1 steps this gives a 
triangular system that can be solved by back substitution as just shown. In this way we 
obtain precisely all solutions of the given system (as proved in Sec. 7.3). 

The pivot a, (in step k) must be different from zero and should be large in absolute 
value to avoid roundoff magnification by the multiplication in the elimination. For this 
we choose as our pivot equation one that has the absolutely largest aj, in column k on or 
below the main diagonal (actually, the uppermost if there are several such equations). This 
popular method is called partial pivoting. It is used in CASs (e.g., in Maple). 

Partial pivoting distinguishes it from total pivoting, which involves both row and 
column interchanges but is hardly used in practice. 

Let us illustrate this method with a simple example. 


Gauss Elimination. Partial Pivoting 
Solve the system 
Ey: 8xo Te 2x3 =-7 


Eo: 3x1 + 5x9 + 2x3 = 8 


Es: 6x, + 2x9 + 8x3 = 26. 


Solution. We must pivot since E, has no x,-term. In Column 1, equation E3 has the largest coefficient. 
Hence we interchange E, and E3, 


6x1 2x9 8x3 26 


3x4 5x2 2x3 8 


8x2 a 2x3 = —-7. 
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Step 1. Elimination of x, 
It would suffice to show the augmented matrix and operate on it. We show both the equations and the augmented 
matrix. In the first step, the first equation is the pivot equation. Thus 


Pivot 6 > 2xy + 8x3 = 26 [6 2 81 26] 
1VO} Xo XZ 

Eliminate >[3x,|+ 5x9 + 2x3 = 8 s A 2. Ss 
| 

8x2 + 2x3 = —7 [0 8 2 =F 


To eliminate x, from the other equations (here, from the second equation), do: 
Subtract 3 = 3 times the pivot equation from the second equation. 


The result is 


6x1 + 2x9 + 8x3 = 26 [6 2 8 ! 26 | 
4x5 — 2x3 = —5 ae 
| 
8x9 + 2x3 = —-7 | 0 8 2! =7] 


Step 2. Elimination of x2 
The largest coefficient in Column 2 is 8. Hence we take the new third equation as the pivot equation, interchanging 
equations 2 and 3, 


ll 
i) 
on 
a 
i) 
oo 


6x1, + 2x9 + 8x3 | 
Pivot 8 => @xg)+ 2x3 = —7 0 8 2 ! _7 
| 
Eliminate > ee i 5 


To eliminate xg from the third equation, do: 


| 

| 
n 
Oo 
a 

| 
i) 


Subtract 5 times the pivot equation from the third equation. 
The resulting triangular system is shown below. This is the end of the forward elimination. Now comes the back 
substitution. 


Back substitution. Determination of x3, x2, x1 
The triangular system obtained in Step 2 is 


6x1 + 2x2 + 8x3 = 26 6 2 8 26 
8x9 + 2x3 =—7 0 8 24-7 

| 
— 3x3 = -3 10 0 -31-2| 


From this system, taking the last equation, then the second equation, and finally the first equation, we compute 
the solution 


x3 = 2 
ta = g(—7 — 2x3) = —1 
x1 = $(26 — 2x — 8x3) = 4. 
This agrees with the values given above, before the beginning of the example. @ 


The general algorithm for the Gauss elimination is shown in Table 20.1. To help explain 
the algorithm, we have numbered some of its lines. b; is denoted by aj,,+1, for uniformity. 
In lines 1 and 2 we look for a possible pivot. [For k = 1 we can always find one; otherwise 
x, would not occur in (1).] In line 2 we do pivoting if necessary, picking an aj, of greatest 
absolute value (the one with the smallest 7 if there are several) and interchange the 
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corresponding rows. If |a,;,| is greatest, we do no pivoting. mj x, in line 4 suggests 
multiplier, since these are the factors by which we have to multiply the pivot equation Ex, 
in Step k before subtracting it from an equation E; below Ej, from which we want to 
eliminate x;,. Here we have written Ej, and E; to indicate that after Step 1 these are no 
longer the equations given in (1), but these underwent a change in each step, as indicated 
in line 5. Accordingly, aj, etc. in all lines refer to the most recent equations, and j 2 k 
in line 1 indicates that we leave untouched all the equations that have served as pivot 
equations in previous steps. For p = k in line 5 we get 0 on the right, as it should be in 
the elimination, 


_ ds 0 
a; MN jp aj a 5 
‘ik gk¢kk ik kk kk 


In line 3, if the last equation in the triangular system is 0 = by, # 0, we have no 
solution. If it is 0 = b}, = 0, we have no unique solution because we then have fewer 
equations than unknowns. 


Gauss Elimination in Table 20.1, Sample Computation 


In Example 1 we had a;; = 0, so that pivoting was necessary. The greatest coefficient in Column | was a3. 
Thus j = 3 in line 2, and we interchanged E, and E3. Then in lines 4 and 5 we computed my, = 3 = A and 


d99 = 5 -—3°2=4, a3 =2-5°8=-2, ay =8—4%-26=—-5, 


and then m3, = a = 0, so that the third equation 8xy + 2x3 = —7 did not change in Step 1. In Step 2 (k = 2) 
we had 8 as the greatest coefficient in Column 2, hence j = 3. We interchanged equations 2 and 3, computed 
M32 = -% = -3 in line 5, and the a33 2 3 aD 3, agq 5 3( 7) 3. This produced the 


triangular form used in the back substitution. i 


If a, = 0 in Step k, we must pivot. If |aj,,| is small, we should pivot because of roundoff 
error magnification that may seriously affect accuracy or even produce nonsensical 
results. 


Difficulty with Small Pivots 


The solution of the system 


0.0004x, + 1.402x2 = 1.406 
0.4003x1 — 1.502x = 2.501 


is x; = 10, xg = 1. We solve this system by the Gauss elimination, using four-digit floating-point arithmetic. 
(4D is for simplicity. Make an 8D-arithmetic example that shows the same.) 
(a) Picking the first of the given equations as the pivot equation, we have to multiply this equation by 
m = 0.4003/0.0004 = 1001 and subtract the result from the second equation, obtaining 
—1405x2 = — 1404. 
Hence x2 = —1404/(—1405) = 0.9993, and from the first equation, instead of x; = 10, we get 


1 0.005 
xy= (1.406 — 1.402 - 0.9993) = = 
0.0004 0.0004 


This failure occurs because |a,| is small compared with |aj9|, so that a small roundoff error in x2 leads to a 


large error in x1. 
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(b) Picking the second of the given equations as the pivot equation, we have to multiply this equation by 
0.0004/0.4003 = 0.0009993 and subtract the result from the first equation, obtaining 
1.404x2 = 1.404. 


Hence x2 = 1, and from the pivot equation x, = 10. This success occurs because |ag;| is not very small 
compared to |ag9|, so that a small roundoff error in x2 would not lead to a large error in x1. Indeed, for 
instance, if we had the value x2 = 1.002, we would still have from the pivot equation the good value 
x1 = (2.501 + 1.505)/0.4003 = 10.01. a 


Table 20.1 Gauss Elimination 


ALGORITHM GAUSS (A = [aj] = [A b]) 


This algorithm computes a unique solution x = [x;] of the system (1) or indicates that 
(1) has no unique solution. 
INPUT: Augmented n X (n + 1) matrix A = [ajx], where aj n+1 = b; 


OUTPUT: Solution x = [x;] of (1) or message that the system (1) has no 
unique solution 


For k = 1,:--+,n — 1, do: 
1 m=k 
Forj =k + 1,-+-++,n, do: 
If (lame < axl) then m = j 
End 
If a,,;, = 0 then OUTPUT “No unique solution exists” 
Stop 


[Procedure completed unsuccessfully] 
Else exchange row k and row m 
If dy, = 0 then OUTPUT “No unique solution exists.” 


Stop 
Else 
4 Forj =k + 1,--:+,n, do: 
Gk 
MN jk: = kk 
5 Forp =k +1,:-+:,n + 1, do: 
Qjp: ~ Ajp ~ Mir Akp 
End 
End 
End 
Annt+1 
6 Xn = —G [Start back substitution] 
TUN 
Fori=n-—1,---,1,do 
1 n 
7 x4 = Hains = » aes) 
j=itl 
End 


OUTPUT x = [x;]. Stop 
End GAUSS 


850 


CHAP. 20 Numeric Linear Algebra 


Error estimates for the Gauss elimination are discussed in Ref. [E5] listed in App. 1. 


Row scaling means the multiplication of each Row j by a suitable scaling factor sj. It is 
done in connection with partial pivoting to get more accurate solutions. Despite much 
research (see Refs. [E9], [E24] in App. 1) and the proposition of several principles, scaling 
is still not well understood. As a possibility, one can scale for pivot choice only (not in 
the calculation, to avoid additional roundoff) and take as first pivot the entry a; for which 
laj1| / |Aj| is largest; here A; is an entry of largest absolute value in Row j. Similarly in 
the further steps of the Gauss elimination. 
For instance, for the system 


4.0000x; + 14020x2g = 14060 
0.4003x 1 — 1.502x2g = 2.501 


we might pick 4 as pivot, but dividing the first equation by 10* gives the system in 
Example 3, for which the second equation is a better pivot equation. 


Operation Count 
Quite generally, important factors in judging the quality of a numeric method are 


Amount of storage 
Amount of time (= number of operations) 


Effect of roundoff error 


For the Gauss elimination, the operation count for a full matrix (a matrix with relatively 
many nonzero entries) is as follows. In Step k we eliminate x, from n — k equations. 
This needs n — k divisions in computing the mj, (line 3) and (n — k)(n — k + 1) 
multiplications and as many subtractions (both in line 4). Since we do n — | steps, k 
goes from | to m— 1 and thus the total number of operations in this forward 
elimination is 


n-1 n-1 
fa => @-H+2>5 @-bHa-k+D (writen — k = s) 
Kk=1 k=1 
n-1 n-1 
= S425) s(s + 1) a(n Dn + 2(n? Dn = 2n3 
s=1 s=1 


where 2n3/ 3 is obtained by dropping lower powers of n. We see that f(n) grows about 
proportional to n>, We say that f(n) is of order n° and write 


fn) = On") 
where O suggests order. The general definition of O is as follows. We write 
f(a) = O(h(n)) 
if the quotients | f(n)/h(n)| and |h(n)/f(n)| remain bounded (do not trail off to infinity) 


as n— ©, In our present case, h(n) = n? and, indeed, f(n)/n3 = 2 because the omitted 
terms divided by n® go to zero asn > ™, 
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In the back substitution of x; we make n — i multiplications and as many subtractions, 
as well as | division. Hence the number of operations in the back substitution is 


bn) =2> @-D+n=2>d stn=nn+lt+n=n? + 2 = OM. 


i=1 s=1 


We see that it grows more slowly than the number of operations in the forward elimination 
of the Gauss algorithm, so that it is negligible for large systems because it is smaller by 
a factor n, approximately. For instance, if an operation takes 107° sec, then the times 
needed are: 


Algorithm n = 1000 n = 10000 


Elimination 0.7 sec 


Back substitution 0.001 sec 


PROBLEM SET 2071 


APPLICATIONS of linear systems see Secs. 7.1 and 8.2. 7. —3x1 + 6x2 — 9x3 = —46.725 
1-3} GEOMETRIC INTERPRETATION X1 — 4x2 + 3x3 = 19.571 
Solve graphically and explain geometrically. 2x1 + 5x9 — Tx3 = —20.073 


1. — 4x9 = 20.1 
ms “a 8. 5x41 + 3x9 ale x3 > 2 
3x1 + Sxq = 5.9 
—4xo 5 a 8x3 =-3 


2. —5.00x1 + 8.40x2 = 0 


10x41 _ 6x2 Tr 26x3 = 0 
10.25x1 — 17.22x5 = 0 
9. 6xg + 13x3 = 137.86 
Be 7.2X4 _ 3.5x9 = 16.0 
6x41 = 8x3 = —85.88 
—14.4x, + 7.0x2 = 31.0 
13x4 = 8x2 = 178.54 
GAUSS ELIMINATION 
. 5 So cee , 10. 4x4 Ir 4x9 + 2x3 =0 
Solve the following linear systems by Gauss elimination, 
with partial pivoting if necessary (but without scaling). Show 3xy — Xo + 2x3 =0 
the intermediate steps. Check the result by substitution. If no 
: : : : 3x, + 7xg + x3 =0 
solution or more than one solution exists, give a reason. 
4, 6x4 + xg= —3 11. 3.4x1 6.12x9 2.72x3 =0 
4x, —2x29 = 6 —x, + 1.80x2 + 0.80x3 = 0 
5. 2x4 = 8x2 =--4 2.7x4 _ 4.86x9 =p 2.16x3 =0 
3xy + xg = 7 12. 5x, + 3x2 + xg = 2 
6. 25.38x, — 15.48x2 = 30.60 —4xo + 8xg = —3 
—14.10x1 + 8.60x2 = —17.00 10x, — 6x9 + 26x3 = 0 
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13. 


14. 


15. 


16. 


17. 


18. 
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3xo + 5x3= 1.20736 
3x1 — 4xo = —2.34066 
5x1 + 6x3 = —0.329193 
47x, + 4x — 7x3 = —118 
19x, — 3xg + 2x3 = 43 
—15x 1 + 5x2 = —25 
2.2x9 + 1.5x3 — 3.3x4 = —9.30 
0.2x1 + 1.8x9 +42x,= 9.24 
=f. Sexe 2.5%3 = —8.70 
0.5x4 — 3.8x3 + 1.5x4 = 11.94 
3.2x1 + 1.6x9 = -08 
1.6x1 — 0.8x9 + 2.4x3 = 16.0 
2.4x5 — 4.8x3 + 3.6x4 = —39.0 
3.6x3 + 2.4x4= 10.2 
CAS EXPERIMENT. Gauss Elimination. Write a 


program for the Gauss elimination with pivoting. 
Apply it to Probs. 13-16. Experiment with systems 
whose coefficient determinant is small in absolute 
value. Also investigate the performance of your 
program for larger systems of your choice, including 
sparse systems. 


TEAM PROJECT. Linear Systems and Gauss 
Elimination. (a) Existence and uniqueness. Find a 
and b such that ax, + x9 = b,x1 + xX» = 3 has (i) a 
unique solution, (ii) infinitely many solutions, (iii) no 
solutions. 


(b) Gauss elimination and nonexistence. Apply the 
Gauss elimination to the following two systems and 


compare the calculations step by step. Explain why the 
elimination fails if no solution exists. 


Xi xg + x3 = 3 
4x1 + 2x9 — x3 = 5 


9x4 + 5x2 = 3° 13 


xp Xg +x3 = 3 
4x1 + 2x9 — x3 = 5 
Oxy oh 5x2 = X30- 12. 


(c) Zero determinant. Why may a computer program 
give you the result that a homogeneous linear system 
has only the trivial solution although you know its 
coefficient determinant to be zero? 


(d) Pivoting. Solve System (A) (below) by the Gauss 
elimination first without pivoting. Show that for any 
fixed machine word length and sufficiently small e > 0 
the computer gives x2 = 1 and then x; = 0. What 
is the exact solution? Its limit as e ~0? Then solve 
the system by the Gauss elimination with pivoting. 
Compare and comment. 

(e) Pivoting. Solve System (B) by the Gauss elimination 
and three-digit rounding arithmetic, choosing (1) the first 
equation, (ii) the second equation as pivot equation. 
(Remember to round to 3S after each operation before 
doing the next, just as would be done on a computer!) 
Then use four-digit rounding arithmetic in those two 
calculations. Compare and comment. 


(A) €xX, + x2 = 1 


Xy+x2=2 


(B) 4.03x, + 2.16x5 = 


| 

| 
> 
a 
= 


| 

| 
a 
an 
\o 


6.21x1 + 3.35x29 = 


20.2 Linear Systems: LU-Factorization, 


Matrix Inversion 


We continue our discussion of numeric methods for solving linear systems of n equations 


in n unknowns x4,°°+, Xn, 

(1) Ax =b 

where A = [a;,] is the n Xn given coefficient matrix and x! = [x 1°°',Xy] and 
b' = [b1,:+-, by]. We present three related methods that are modifications of the Gauss 
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EXAMPLE 1 


elimination, which require fewer arithmetic operations. They are named after Doolittle, 
Crout, and Cholesky and use the idea of the LU-factorization of A, which we explain 
first. 

An LU-factorization of a given square matrix A is of the form 


(2) A= LU 
where L is lower triangular and U is upper triangular. For example, 


2 3 1 0} | 2 3 
A= =LU= 
8 5 4 1}|;0 —-7 


It can be proved that for any nonsingular matrix (see Sec. 7.8) the rows can be reordered 
so that the resulting matrix A has an LU-factorization (2) in which L turns out to be the 
matrix of the multipliers mj, of the Gauss elimination, with main diagonal 1,---, 1, and 
U is the matrix of the triangular system at the end of the Gauss elimination. (See Ref. 
[E5], pp. 155-156, listed in App. 1.) 

The crucial idea now is that L and U in (2) can be computed directly, without solving 
simultaneous equations (thus, without using the Gauss elimination). As a count shows, 
this needs about n>/ 3 operations, about half as many as the Gauss elimination, which 
needs about 2n3/3 (see Sec. 20.1). And once we have (2), we can use it for solving Ax = b 
in two steps, involving only about n? operations, simply by noting that Ax = LUx = b 
may be written 


(3) (a) Ly =b where (b) Ux=y 


and solving first (3a) for y and then (3b) for x. Here we can require that L have main 
diagonal 1,---, 1 as stated before; then this is called Doolittle’s method.! Both systems 
(3a) and (3b) are triangular, so we can solve them as in the back substitution for the Gauss 
elimination. 

A similar method, Crout’s method,” is obtained from (2) if U (instead of L) is required 
to have main diagonal 1,---, 1. In either case the factorization (2) is unique. 


Doolittle’s Method 
Solve the system in Example | of Sec. 20.1 by Doolittle’s method. 


Solution. The decomposition (2) is obtained from 


ay ay2 a3 3 5 2 1 0 0 U4qy uyj2 u43 
A= [ajx] =| a1 a2, a3 | = 0 8 2/= moa) 1 0 0 uo u93 
| 431 a32, 433 6 2 8 M31 M392, 1 0 0 433 | 


"MYRICK H. DOOLITTLE (1830-1913). American mathematician employed by the U.S. Coast and Geodetic 
Survey Office. His method appeared in U.S. Coast and Geodetic Survey, 1878, 115-120. 


*PRESCOTT DURAND CROUT (1907-1984), American mathematician, professor at MIT, also worked at 
General Electric. 
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by determining the mj, and uj,,, using matrix multiplication. By going through A row by row we get successively 


ayy = 3=1: “41 = 411 ay = S=1- uy2 = Uuyg2 


a3 =2= 1+ u43 = u43 


dg, = 0 = moytty1 dgg = 8 = moiU12 + Use dog = 2 = mgi13 + Ugg 
mo) = 0 ugg = 8 uo3 = 2 
431 = 6 = m3ylly1 a3 = 2 = m3iU12 + M3Qll99 433 = 8 = m3y13 + M3Ql23 + 33 
= mg31° 3 =2-+-5+4+ mg32°8 =2+2-—1+2+4 ugg 


m3) = 2 M32 = = u33 = 6 


Thus the factorization (2) is 


3 5 2 1 0 0})3 5 2 
0 8 2|= LU =]|0 1 0] | 0 8 2]. 
6 2 8 2 -1 1|/0 0 6 


We first solve Ly = b, determining y, = 8, then yg = —7, then yg from 2y; — yg + yg 


thus (note the interchange in b because of the interchange in A!) 


1 0 O}} y1 8 8 
0 1 0] | yo) =] -7]. Solution y=) SF 
[2-1 1 | | ys 26 3 


Then we solve Ux = y, determining x3 = 3 then xg, then x4, that is, 


3 5 2\ Ix 11 | 8] . 4] 
0 8 2||x2)/=] -7}. Solution x=/-1 
10 0 6][ x3 3 Z 


This agrees with the solution in Example 1| of Sec. 20.1. 


Our formulas in Example | suggest that for general n the entries of the matrices L = [mx] 
(with main diagonal 1,-:-,1 and mj, suggesting “multiplier”) and U = [uj,] in the 


Doolittle method are computed from 


Uik = Ak k=1,-+-,n 
qj . ; 
co a J=2,:++,n 
(4) a=) _ 
Ujk = Az — >, Mjslsk kK=j,+,n, j22 
s=1 


oa 


k-1 
ny = i Mgt 
jk Ugh, @ x js ws) 
s= 
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EXAMPLE 2 


Row Interchanges. Matrices, such as 


0 1 0 1 
or 
1 1 1 0O 


have no LU-factorization (try!). This indicates that for obtaining an LU-factorization, row 
interchanges of A (and corresponding interchanges in b) may be necessary. 


Cholesky’s Method 


For a symmetric, positive definite matrix A (thus A = A’, x"Ax > 0 for all x # 0) we 
can in (2) even choose U = Li, thus uj, = mj; (but cannot impose conditions on the 
main diagonal entries). For example, 


4 2 14 2 0 0 2 1 7 
(5) A=} 2 17. -5|/=LL'=| 1 4 0 0 4 -3 
144 —-5 83 7 =3 5 0 0 5 
The popular method of solving Ax = b based on this factorization A = LL’ is called 


Cholesky’s method. In terms of the entries of L = [/;,,] the formulas for the factorization 
are 


ly = Vay 
aj1 
Lj = i. J=H2, nN 
11 
go . 
(6) lig = pe Se f=2eoryn 
s=1 
1 — 
ls = (cos — Stolen) p=jtl,---.m j22 
Jd s=1 


If A is symmetric but not positive definite, this method could still be applied, but then 
leads to a complex matrix L, so that the method becomes impractical. 


Cholesky’s Method 
Solve by Cholesky’s method: 


4x4 2x9 14x3 = 14 


2X4 a 17x92 = 5x3 = -101 


14x41 = 5x2 aly. 83x3 = 155. 


3 ANDRE-LOUIS CHOLESKY (1875-1918), French military officer, geodecist, and mathematician. Surveyed 
Crete and North Africa. Died in World War I. His method was published posthumously in Bulletin Géodésique 
in 1924 but received little attention until JOHN TODD (191 1—2007) — Irish-American mathematician, numerical 
analysist, and early pioneer of computer methods in numerics, professor at Caltech, and close personal friend 
and collaborator of ERWIN KREYSZIG, see [E20]—taught Cholesky’s method in his analysis course at King’s 
College, London, in the 1940s. 
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Solution. From (6) or from the form of the factorization 


4 ) 14 li 0 0 hi lay Igy 


we compute, in the given order, 


dai 2 a31 14 
y= Vay=2 Igy =——~ = -= 1 lye SS 
ly 2 yy 2 


loo d29, 13, V17 1 4 


1 1 
Igo (a32 — I3yl21) (=5= 7D 3 
log 4 


Ig3 = Vagg — 12, — 12g = V83 — F — (-3)% = 5. 


This agrees with (5). We now have to solve Ly = b, that is, 


2 0 O}}y1 14 7 
1 4 0] | v2 |=] —101 |. Solution y =| -27}. 
[7 —3 5 || y3 155 5 
As the second step, we have to solve Ux = L'x = y, that is, 
Be 1 7| [x ‘| . 7] . 3] 
0 4 ~-3)||x2/=|-27]}. Solution x=] -6]. HB 
LO 0 5 || x3 5 1 


Stability of the Cholesky Factorization 
The Cholesky LL" -factorization is numerically stable (as defined in Sec. 19.1). 


We have aj; = 13 a ae % ae eee il 3 by squaring the third formula in (6) and solving it 
for aj;. Hence for all 1, (note that /;;, = 0 for k > j) we obtain (the inequality being trivial) 


2 2 2 a 
Vig SB Uj + Ug +++ + 1G = ayy. 


That is, Li is bounded by an entry of A, which means stability against rounding. 


Gauss—Jordan Elimination. Matrix Inversion 


Another variant of the Gauss elimination is the Gauss—Jordan elimination, introduced 
by W. Jordan in 1920, in which back substitution is avoided by additional computations 
that reduce the matrix to diagonal form, instead of the triangular form in the Gauss 
elimination. But this reduction from the Gauss triangular to the diagonal form requires 
more operations than back substitution does, so that the method is disadvantageous for 
solving systems Ax = b. But it may be used for matrix inversion, where the situation is 
as follows. 
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The inverse of a nonsingular square matrix A may be determined in principle by solving 


the n systems 


(7) 


where b; is the jth column of the n X n unit matrix. 


However, it is preferable to produce At by operating on the unit matrix I in the same 
way as the Gauss—Jordan algorithm, reducing A to I. A typical illustrative example of this 


method is given in Sec. 7.8. 


PROBLEM SET 20-2 


1-5| DOOLITTLE’S METHOD 


Show the factorization and solve by Doolittle’s method. 


1, 4x1 Slr: 5x9 = 14 


36 


ll 


12x, + 14x9 
2. 2x, + 9x2 = 82 

3x1 — 5x9 = —62 
3. 5x, + 4x9 + x3 = 68 


10x4 9x9 4x3 = 17.6 


10x, + 13x_ + 15x3 = 38.4 


4, 2x4 X92 2x3 = 0 


2x41 t 2x9 x3 > 0 


53 3x4 9x9 6x3 = 46 


18x1 + 48xo + 39x3 = 27.2 


9x4 = 27x92 ae 42x3 = 9.0 


6. TEAM PROJECT. Crout’s method _factorizes 
A = LU, where L is lower triangular and U is upper 
triangular with diagonal entries uj; = 1,j = 1,---,n. 
(a) Formulas. Obtain formulas for Crout’s method 
similar to (4). 

(b) Examples. Solve Prob. 5 by Crout’s method. 


(c) Factor the following matrix by the Doolittle, 
Crout, and Cholesky methods. 


. 1 -4 > 
=A. BS 4 
| 2 4. 24 


(d) Give the formulas for factoring a tridiagonal 
matrix by Crout’s method. 


(e) When can you obtain Crout’s factorization from 
Doolittle’s by transposition? 


7-12 


10. 


11. 


12. 


CHOLESKY’S METHOD 


Show the factorization and solve. 


7. 9x1 + 6x2q + 12xg = 174 
6x1, + 13x29 + 1lx3 = 23.6 
12x, + Ilx2g + 26xzg = 30.8 
~ Ax, 6x2 8x3 = 0 
6x1 + 34x92 52x3 = —160 
8x4 + 52xq + 129x3 = —452 
. O.01x, + 0.03x3 = 0.14 
0.16xg + 0.08x3 = 0.16 
0.03x1 + 0.08x2 + 0.14x3 = 0.54 
4x1 + 2x3 = 1.5 
4xg + xg = 4.0 
2x4 + Xo + 2x3 = 2.5 
Xy x2 3x3 2x4= 15 
x1 + 5x9 5x3 2x4 = —35 
3x1 — 5xq + 19x3 3x4 = 94 
2x1 — 2x2 3xg + 21x4 = 1 
4x1 2xo + 4x3 = 20 
2x4 2x9 + 3x3 + 2x4 36 
4x1 + 3xg + 6x3 + 3x4 = 60 
2xg + 3x3 + 9x4 = 122 


13. 


Definiteness. Let A, B be n X n and positive definite. 


Are —A, AT, A + B, A — B positive definite? 
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14. CAS PROJECT. Cholesky’s Method. (a) Write a 
program for solving linear systems by Cholesky’s 
method and apply it to Example 2 in the text, to Probs. 
7-9, and to systems of your choice. 


(b) Splines. Apply the factorization part of the 
program to the following matrices (as they occur in 
(9), Sec. 19.4 (with G= 1), in connection with 


15-19| INVERSE 


Find the inverse by the Gauss—Jordan method, showing the 
details. 


15. In Prob. 1 
17. In Team Project 6(c) 
19. In Prob. 12 


16. In Prob. 4 
18. In Prob. 9 


20. Rounding. For the following matrix A find det A. 


li ; 

Splnes) What happens if you roundoff the given entries to 
(a) 5S, (b) 4S, (c) 3S, (d) 2S, (e) 1S? What is the 

= vi) 1 0 0 practical implication of your work? 

2 1 = 
1 4 1 #0 3 4 2 
1 4 ; 
0 1 4 1 A=|-9 1 4 
0 1 
= 6 f 3 | 6 28 49 


20.3 Linear Systems: Solution by Iteration 


EXAMPLE 1 


The Gauss elimination and its variants in the last two sections belong to the direct methods 
for solving linear systems of equations; these are methods that give solutions after an 
amount of computation that can be specified in advance. In contrast, in an indirect or 
iterative method we start from an approximation to the true solution and, if successful, 
obtain better and better approximations from a computational cycle repeated as often as 
may be necessary for achieving a required accuracy, so that the amount of arithmetic 
depends upon the accuracy required and varies from case to case. 

We apply iterative methods if the convergence is rapid (if matrices have large main 
diagonal entries, as we shall see), so that we save operations compared to a direct method. 
We also use iterative methods if a large system is sparse, that is, has very many zero 
coefficients, so that one would waste space in storing zeros, for instance, 9995 zeros per 
equation in a potential problem of 16" equations in 10* unknowns with typically only 5 
nonzero terms per equation (more on this in Sec. 21.4). 


Gauss—Seidel Iteration Method* 


This is an iterative method of great practical importance, which we can simply explain in 
terms of an example. 


Gauss-Seidel Iteration 


We consider the linear system 


x1 — 0.25xg — 0.25x3 = 50 


—0.25x1 + Xo — 0.25x4 = 50 
(1) 


—0.25x1 + x3 — 0.25x4 = 25 


0.25x2 — 0.25x3 4 Xq = 25. 


4PHILIPP LUDWIG VON SEIDEL (1821-1896), German mathematician. For Gauss see footnote 5 in 


Sec. 5.4. 
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(Equations of this form arise in the numeric solution of PDEs and in spline interpolation.) We write the system 


in the form 
x1 0.25x2 a 0.25x3 + 50 
Xg = 0.25x41 + 0.25x4 + 50 
(2) 
x3 = 0.25x, + 0.25x4 + 25 
x4 0.25x2 + 0.25x3 +25. 


These equations are now used for iteration; that is, we start from a (possibly poor) approximation to the solution, 


say x = 100, x = 100, x8 = 100, x = 100, and compute from (2) a perhaps better approximation 


Use “old” values 
(“New” values here not yet available) 


0.252 + 0.25) + 50.00 = 100.00 


(3) xi) = 0.25x. | + 50.00 = 100.00 
eae 0.252 | + 25.00 = 75.00 
ove 0.2525) + 0.2540) + 25.00 = 68.75 


Use “new” values 


These equations (3) are obtained from (2) by substituting on the right the most recent approximation for each 
unknown. In fact, corresponding values replace previous ones as soon as they have been computed, so that in 


the second and third equations we use x§P (not x, and in the last equation of (3) we use xP and x9 (not 


x¥ and x9). Using the same principle, we obtain in the next step 
xP = 0.25xSP + 0.25x? + 50.00 = 93.750 
xX = 0.25x? + 0.25xP + 50.00 = 90.625 
x2 = 0.25x9? + 0.25xP + 25.00 = 65.625 
xP = 0.25x%? + 0.252? + 25.00 = 64.062 


Further steps give the values 


xy X2 X3 X4 


89.062 88.281 63.281 62.891 
87.891 87.695 62.695 62.598 
87.598 87.549 62.549 62.524 
87.524 87.512 62.512 62.506 
87.506 87.503 62.503 62.502 


Hence convergence to the exact solution xy = x2 = 87.5, x3 = x4 = 62.5 (verify!) seems rather fast. 2] 


An algorithm for the Gauss-Seidel iteration is shown in Table 20.2. To obtain the 
algorithm, let us derive the general formulas for this iteration. 

We assume that aj = 1 for 7 = 1,-+-,n. (Note that this can be achieved if we can 
rearrange the equations so that no diagonal coefficient is zero; then we may divide each 
equation by the corresponding diagonal coefficient.) We now write 
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(4) A=I1+L+U (ay = 1) 


where Tis the n X n unit matrix and L and U are, respectively, lower and upper triangular 
matrices with zero main diagonals. If we substitute (4) into Ax = b, we have 


Ax = (I1+L+U)x=b. 
Taking Lx and Ux to the right, we obtain, since Ix = x, 
(5) x =b — Lx — Ux. 


Remembering from (3) in Example 1 that below the main diagonal we took “new” 
approximations and above the main diagonal “old” ones, we obtain from (5) the desired 


iteration formulas 
“New” “Ola” 


(6) xt =pb=- Lx@tP a Ux” (ayy = 1) 


where x” = iia is the mth approximation and x“"*? = lke is the (m + 1)st 
approximation. In components this gives the formula in line | in Table 20.2. The matrix 
A must satisfy aj; # 0 for all j. In Table 20.2 our assumption aj; = 1 is no longer required, 
but is automatically taken care of by the factor 1/a;; in line 1. 


Table 20.2 Gauss-Seidel Iteration 


ALGORITHM GAUSS-SEIDEL (A, b, x, €, N) 


This algorithm computes a solution x of the system Ax = b given an initial approximation 


x where A = [ajc] is ann X n matrix with aj #0,j = 1,-++,n. 
INPUT: A, b, initial approximation x® tolerance € > 0, maximum number 
of iterations N 
OUTPUT: Approximate solution x°” = [xo] or failure message that x“ does 
not satisfy the tolerance condition 
For m = 0,::-,N— 1, do: 
For j = 1,+--+,2, do: 
1 1 — . 
Cnt) _ (m+1) (m) 
1 Xj = Gu @ = S AjnX he” = > ant”) 
i k=1 emia 
End 
2 If max |xj"*? — x9”| < € |x§"*)| then OUTPUT x°”*”. Stop 
j 
[Procedure completed successfully] 


End 
OUTPUT: “No solution satisfying the tolerance condition obtained after NV 
iteration steps.” Stop 
[Procedure completed unsuccessfully] 
End GAUSS-SEIDEL 
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Convergence and Matrix Norms 


An iteration method for solving Ax = b is said to converge for an initial x if the 
corresponding iterative sequence gO ag ees converges to a solution of the given 
system. Convergence depends on the relation between x” andx""*) To get this relation 
for the Gauss-Seidel method, we use (6). We first have 

(I+ L) xt =b-— Ux™ 


and by multiplying by (I + L)~+ from the left, 
(7) xo Mex eS Lb where C=-0+L)1U. 


The Gauss-Seidel iteration converges for every x if and only if all the eigenvalues 
(Sec. 8.1) of the “iteration matrix” C = [c;,] have absolute value less than 1. (Proof in 
Ref. [E5], p. 191, listed in App. 1.) 

CAUTION! If you want to get C, first divide the rows of A by aj; to have main diagonal 
1,---, 1. If the spectral radius of C (= maximum of those absolute values) is small, then 
the convergence is rapid. 


Sufficient Convergence Condition. A sufficient condition for convergence is 
(8) |C|| < 1. 


Here ||C|| is some matrix norm, such as 


(9) I|C|| = {rd (Frobenius norm) 
j=1k=1 


or the greatest of the sums of the I cine in a column of C 


nr 
(10) [Cl] = max J Ij (Column “sum” norm) 
k 
jel 


or the greatest of the sums of the Icjrl in a row of C 


n 
(11) IC] = max > Icjrcl (Row “sum” norm). 
I =1 


These are the most frequently used matrix norms in numerics. 

In most cases the choice of one of these norms is a matter of computational convenience. 
However, the following example shows that sometimes one of these norms is preferable 
to the others. 
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Test of Convergence of the Gauss-Seidel Iteration 


Test whether the Gauss-Seidel iteration converges for the system 


m+ yt oad r=2-hy- he 
xt2yt+ z=4 written y=2-4x—-4z 
x+ yt2z=4 z=2—4x—dy. 
Solution. The decomposition (multiply the matrix by 3 — why?) is 
[1g | fo o oo] fo £ 4 
4 1 d]/=1+L+U=1+]%4 O Of+]o0 o 4 
(e. 2 4] 1s & of lo o of 
It shows that 
[1 0 olfo 4 4) fo -4 -3 
ea=045"U 2 tf OOO Slal@ Gs 
l-z -2 1,0 o oOo] jo | 3 
We compute the Frobenius norm of C 
IIC\| @ ba tae tae we4 ale (R)* 0.884 < 1 


and conclude from (8) that this Gauss—Seidel iteration converges. It is interesting that the other two norms would 
permit no conclusion, as you should verify. Of course, this points to the fact that (8) is sufficient for convergence 


rather than necessary. a 
Residual. Given a system Ax = b, the residual r of x with respect to this system is 
defined by 

(12) r=b-— Ax. 


Clearly, r = 0 if and only if x is a solution. Hence r # 0 for an approximate solution. In 
the Gauss-Seidel iteration, at each stage we modify or relax a component of an 
approximate solution in order to reduce a component of r to zero. Hence the Gauss-Seidel 
iteration belongs to a class of methods often called relaxation methods. More about the 
residual follows in the next section. 


Jacobi Iteration 


The Gauss-Seidel iteration is a method of successive corrections because for each 
component we successively replace an approximation of a component by a corresponding 
new approximation as soon as the latter has been computed. An iteration method is called 
a method of simultaneous corrections if no component of an approximation x“” is used 
until all the components of x“ have been computed. A method of this type is the Jacobi 
iteration, which is similar to the Gauss—Seidel iteration but involves not using improved 
values until a step has been completed and then replacing ie by x™* at once, directly 
before the beginning of the next step. Hence if we write Ax = b (with aj; = 1 as before!) 
in the form x = b + (I — A)x, the Jacobi iteration in matrix notation is 
xmrb =b+a- A)x'™ 


(13) (ajj = 1). 


SEC. 20.3 Linear Systems: Solution by Iteration 


863 


This method converges for every choice of x© if and only if the spectral radius of I — A 
is less than 1. It has recently gained greater practical interest since on parallel processors 
all n equations can be solved simultaneously at each iteration step. 

For Jacobi, see Sec. 10.3. For exercises, see the problem set. 


PROBLEM SET 20-3 


1. 
2. 


3. 


Verify the solution in Example 1 of the text. 


Show that for the system in Example 2 the Jacobi 
iteration diverges. Hint. Use eigenvalues. 


Verify the claim at the end of Example 2. 


4-10 


GAUSS-SEIDEL ITERATION 


Do 5 steps, starting from xp = [1 1 


1]' and using 6S in 


the computation. Hint. Make sure that you solve each equation 
for the variable that has the largest coefficient (why?). Show 
the details. 


4. 


10. 


4x1 — Xe = 21 
x1 + 4x9 x3 45 
— xg+4x3 = 33 
10x41 Xo x3 = 6 
x1 + 10x92 x3 = 6 
xy Xo + 10x3 = 6 
xg+ 7x3 = 25.5 
5x, + Xo = 0 
x1 + 6x9 x3 = —10.5 
5x4 2x9 = 18 
2x1 + 10x2 2x3 = —60 
— 2xeg+ 15x3 = 128 
3x1 + 2x9 x3 =7 
xy + 3xg + 2x3 =4 
2x4 Xo + 3x3 =7 
5x4 Xo + 2x3 = 19 
xy + 4x9 — 2x3 = —2 
2x4 + 3x9 + 8x3 = 39 
4x, + 5x3 = 125 
Xy + 6xg + 2x3 = 185 
8x1 + 2x2 x3 = —11.5 


11. 


12. 


13. 


Apply the Gauss-Seidel iteration (3 steps) to the system 
in Prob. 5, starting from (a) 0,0,0 (b) 10, 10, 10. 
Compare and comment. 


In Prob. 5, compute C (a) if you solve the first equation 
for x1, the second for x, the third for x3, proving 
convergence; (b) if you nonsensically solve the third 
equation for x1, the first for x9, the second for x3, proving 
divergence. 


CAS Experiment. Gauss-Seidel Iteration. (a) Write 
a program for Gauss-Seidel iteration. 


(b) Apply the program A(‘)x = b, to starting from 
[0 0 OJ", where 


1 t t 2 
A(t) =| ¢ 1 alte b =| 2 
t t 1 2 


For t = 0.2, 0.5, 0.8,0.9 determine the number of 
steps to obtain the exact solution to 6S and the 
corresponding spectral radius of C. Graph the number 
of steps and the spectral radius as functions of t and 
comment. 


(c) Successive overrelaxation (SOR). Show that by 
adding and subtracting x” on the right, formula (6) 
can be written 


xn tb) = x +b-— Lx@™tb _ (U + Dx” 
(qj ad 1). 


Anticipation of further corrections motivates the 
introduction of an overrelaxation factor w > 1 to get 
the SOR formula for Gauss-Seidel 


xen — xm a o(b = Lx@tb) 


14 
al —(U+ Dx”) (aj = 1) 


intended to give more rapid convergence. A rec- 
ommended value is w = 2/(1 + V1 — p), where p is 
the spectral radius of C in (7). Apply SOR to the matrix 
in (b) for tf = 0.5 and 0.8 and notice the improvement of 
convergence. (Spectacular gains are made with larger 
systems.) 
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14-17 | JACOBI ITERATION 18-20 | NORMS 
Do 5 steps, starting from x9 = [1 1 1]. Compare with Compute the norms (9), (10), (11) for the following (square) 
the Gauss-Seidel iteration. Which of the two seems to matrices. Comment on the reasons for greater or smaller 
converge faster? Show the details of your work. differences among the three numbers. 

14. The system in Prob. 4 18. The matrix in Prob. 10 


15. The system in Prob. 9 

16. The system in Prob. 10 

17. Show convergence in Prob. 16 by verifying that I — A, 2k kk 
where A is the matrix in Prob. 16 with the rows divided 20. k —2k k 


by the corresponding main diagonal entries, has the 
eigenvalues —0.519589 and 0.259795 + 0.246603i. =k. =k 2k 


19. The matrix in Prob. 5 


20.4 Linear Systems: II|-Conditioning, Norms 


One does not need much experience to observe that some systems Ax = b are good, 
giving accurate solutions even under roundoff or coefficient inaccuracies, whereas others 
are bad, so that these inaccuracies affect the solution strongly. We want to see what is 
going on and whether or not we can “trust” a linear system. Let us first formulate the two 
relevant concepts (ill- and well-conditioned) for general numeric work and then turn to 
linear systems and matrices. 

A computational problem is called ill-conditioned (or il/-posed) if “small” changes in 
the data (the input) cause “large” changes in the solution (the output). On the other hand, 
a problem is called well-conditioned (or well-posed) if “small” changes in the data cause 
only “small” changes in the solution. 

These concepts are qualitative. We would certainly regard a magnification of inaccuracies 
by a factor 100 as “large,” but could debate where to draw the line between “large” and 
“small,” depending on the kind of problem and on our viewpoint. Double precision may 
sometimes help, but if data are measured inaccurately, one should attempt changing the 
mathematical setting of the problem to a well-conditioned one. 

Let us now turn to linear systems. Figure 445 explains that ill-conditioning occurs if 
and only if the two equations give two nearly parallel lines, so that their intersection point 
(the solution of the system) moves substantially if we raise or lower a line just a little. 
For larger systems the situation is similar in principle, although geometry no longer helps. 
We shall see that we may regard ill-conditioning as an approach to singularity of the 
matrix. 


y y 


(a) (0) 


Fig. 445. (a) Well-conditioned and (b) ill-conditioned 
linear system of two equations in two unknowns 


SEC. 20.4 Linear Systems: Il|-Conditioning, Norms 865 


EXAMPLE 1 


EXAMPLE 2 


An Ill-Conditioned System 


You may verify that the system 


0.9999x — 1.0001y 


ll 
_ 


x= y=1 
has the solution x = 0.5, y = —0.5, whereas the system 


0.9999x — 1.000ly = 1 


ce y=lt+e 


has the solution x = 0.5 + 5000.5e, y = —0.5 + 4999.5e. This shows that the system is ill-conditioned because 
a change on the right of magnitude € produces a change in the solution of magnitude 5000e, approximately. 
We see that the lines given by the equations have nearly the same slope. 


Well-conditioning can be asserted if the main diagonal entries of A have large absolute 
values compared to those of the other entries. Similarly if A? and A have maximum 
entries of about the same absolute value. 


Ill-conditioning is indicated if A~* has entries of large absolute value compared to those 
of the solution (about 5000 in Example 1) and if poor approximate solutions may still 
produce small residuals. 


Residual. The residual r of an approximate solution x of Ax = b is defined as 
(1) r=b — Ax. 

Now b = Ax, so that 

(2) r = A(x — AX). 

Hence r is small if x has high accuracy, but the converse may be false: 


Inaccurate Approximate Solution with a Small Residual 
The system 
1.0001x4 + X29 = 2.0001 


x1 + 1.0001x2 = 2.0001 


has the exact solution x; = 1, xg = 1. Can you see this by inspection? The very inaccurate approximation 
¥1 = 2.0000, Xz = 0.0001 has the very small residual (to 4D) 


2.0001 1.0001 1.0000 | | 2.0000 2.0001 2.0003 —0.0002 
r= = = = = . 
2.0001 1.0000 1.0001 } | 0.0001 2.0001 2.0001 0.0000 


From this, a naive person might draw the false conclusion that the approximation should be accurate to 3 or 4 
decimals. 

Our result is probably unexpected, but we shall see that it has to do with the fact that the system is 
ill-conditioned. iz 


Our goal is to show that ill-conditioning of a linear system and of its coefficient matrix A 
can be measured by a number, the condition number x(A). Other measures for ill-conditioning 
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have also been proposed, but k(A) is probably the most widely used one. «(A) is defined in 
terms of norm, a concept of great general interest throughout numerics (and in modern 
mathematics in general!). We shall reach our goal in three steps, discussing 


1. Vector norms 
2. Matrix norms 


3. Condition number « of a square matrix 


Vector Norms 


A vector norm for column vectors x = [x,;] with n components (n fixed) is a generalized 
length or distance. It is denoted by ||x|| and is defined by four properties of the usual 
length of vectors in three-dimensional space, namely, 


(a) ||x|| is a nonnegative real number. 
(b) ||x||=0 if andonly if x =0. 


3 
” (c) [kx] = [kl |x|] for all &. 


(d) |/x + y|| S|lx|] + llyl| (Triangle inequality). 


If we use several norms, we label them by a subscript. Most important in connection with 
computations is the p-norm defined by 


(4) [xp = Claal? + eal? + +++ + lxnl?Y” 


where p is a fixed number and p 2 1. In practice, one usually takes p = 1 or 2 and, as a 


third norm, ||x||.. (the latter as defined below), that is, 
(5) xl = [oi esteeeet= sec (“7,-norm’’) 
(6) |x|lz = Vid or aby (“Euclidean” or “/y-norm’”’) 
(7) I|x||-. = Tne Ix;| (“l..-norm’’). 
j 


For n = 3 the /g-norm is the usual length of a vector in three-dimensional space. The 
1,-norm and /,,-norm are generally more convenient in computation. But all three norms 
are in common use. 


Vector Norms 

Ifx’=[2 -3 0 1 4], then||x|,; = 10, |xllp = V30,  ||x||.. = 4. | 
In three-dimensional space, two points with position vectors x and X have distance |x — x| 
from each other. For a linear system Ax = b, this suggests that we take ||x — X|| as a 


measure of inaccuracy and call it the distance between an exact and an approximate 
solution, or the error of X. 


Matrix Norm 


If A is ann X n matrix and x any vector with n components, then Ax is a vector with n 
components. We now take a vector norm and consider ||x|| and ||Ax||. One can prove (see 
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EXAMPLE 4 


Ref. [E17]. pp. 77, 92-93, listed in App. 1) that there is a number c (depending on A) 
such that 


(8) || Ax|| = c||x|| for all x. 


Let x # 0. Then||x|| > 0 by (3b) and division gives || Ax||/||x|| = c. We obtain the smallest 
possible c valid for all x (# 0) by taking the maximum on the left. This smallest c is 
called the matrix norm of A corresponding to the vector norm we picked and is denoted 
by ||A||. Thus 


|| Ax 


(9) ||A|]| = max ——— 
Ix! 


(x # 0), 


the maximum being taken over all x # 0. Alternatively [see (c) in Team Project 24], 


(10) Al] = max [Ax]. 


The maximum in (10) and thus also in (9) exists. And the name “matrix norm’ is 
justified because || A\|| satisfies (3) with x and y replaced by A and B. (Proofs in Ref. [E17] 
pp. 77, 92-93.) 

Note carefully that || A|| depends on the vector norm that we selected. In particular, one 
can show that 


for the /,-norm (5) one gets the column “sum” norm (10), Sec. 20.3, 


for the /,.-norm (7) one gets the row “sum” norm (11), Sec. 20.3. 
By taking our best possible (our smallest) c = ||A|| we have from (8) 
(11) ||Ax|] = All [xl 


This is the formula we shall need. Formula (9) also implies for two n X n matrices (see 
Ref. [E17], p. 98) 


(12) || AB|| = |All] |B 


| thus A" = |All”. 


See Refs. [E9] and [E17] for other useful formulas on norms. 
Before we go on, let us do a simple illustrative computation. 


Matrix Norms 


Compute the matrix norms of the coefficient matrix A in Example | and of its inverse AL assuming that we 
use (a) the /,-vector norm, (b) the /...-vector norm. 


Solution. We use (4*), Sec. 7.8, for the inverse and then (10) and (11) in Sec. 20.3. Thus 


0.9999 —1.00019 -—5000.0 5000.59 
A= ee ; 
L1.0000 —1.00001 L—5000.0 4999.5 


(a) The /,-vector norm gives the column “sum” norm (10), Sec. 20.3; from Column 2 we thus obtain 
\|A|| = |—1.0001] + |—1.0000| = 2.0001. Similarly, ||A7*]] = 10,000. 
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(b) The /..-vector norm gives the row “sum” norm (11), Sec. 20.3; thus ||A|] = 2, |A7*|| = 10000.5 from 
Row 1. We notice that || A7+|| is surprisingly large, which makes the product || A\| || A7| large (20,001). We shall 
see below that this is typical of an ill-conditioned system. 


Condition Number of a Matrix 


We are now ready to introduce the key concept in our discussion of ill-conditioning, the 
condition number «(A) of a (nonsingular) square matrix A, defined by 


(13) x(A) = ||Al||A7?|]. 


The role of the condition number is seen from the following theorem. 


Condition Number 


A linear system of equations Ax = b and its matrix A whose condition number (13) 
is small are well-conditioned. A large condition number indicates ill-conditioning. 


b = Ax and (11) give ||b|| S ||A|| ||x||. Let b 4 0 and x 4 0. Then division by ||b]| ||x| 
gives 


(14) tell 
IIx [Ib] 


Multiplying (2) r = A(x — X) by A”? from the left and interchanging sides, we have 
x — ¥ = A’! r. Now (11) with A“? and r instead of A and x yields 


[x — X= ]A*r|] SAT [lr 
Division by ||x|| [note that ||x|| # 0 by (3b)] and use of (14) finally gives 


Pe go ty pe seep acaypey = cay et. 
Ix) [lx [pI |p| 


(15) 


Hence if «(A) is small, a small ||r||/||b|] implies a small relative error ||x — X||/||x||, so 
that the system is well-conditioned. However, this does not hold if «(A) is large; then a 
small ||r|j/||b|| does not necessarily imply a small relative error ||x — X|j/||x||. a 


Condition Numbers. Gauss-Seidel Iteration 


is ft 2] [12 -2 -2] 
1 
A=]1 4 2 has the inverse ACh = sh =2 19 —9}. 
[1 2 4 | | -2 —9 19 | 


Since A is symmetric, (10) and (11) in Sec. 20.3 give the same condition number 
(A) = ||Al] ||A7+|] = 7+ 3g - 30 = 3.75. 


We see that a linear system Ax = b with this A is well-conditioned. 
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EXAMPLE 6 


For instance, if b = [14 0 28)", the Gauss algorithm gives the solution x = [2 —S5 oy", (confirm 
this). Since the main diagonal entries of A are relatively large, we can expect reasonably good convergence of 
the Gauss-Seidel iteration. Indeed, starting from, say, x9 = [1 1 Wy, we obtain the first 8 steps (3D values) 


xy x2 x3 
1.000 1.000 1.000 
2.400 —1.100 6.950 
1.630 —3.882 8.534 
1.870 —4,734 8.900 
1.967 —4,942 8.979 
1.993 —4.988 8.996 
1.998 —4,997 8.999 
2.000 —5.000 9.000 
2.000 —5.000 9.000 2] 


Ill-Conditioned Linear System 


Example 4 gives by (10) or (11), Sec. 20.3, for the matrix in Example | the very large condition number 
k(A) = 2.0001 - 10000 = 2 + 10000.5 = 200001. This confirms that the system is very ill-conditioned. 
Similarly in Example 2, where by (4*), Sec. 7.8 and 6D-computation, 


5000.5 —5.000.0 


” 1 1.0001 —1.0000 
~ 0.0002 


— 1.0000 1.0001 —5000.0 5000.5 


so that (10), Sec. 20.3, gives a very large (A), explaining the surprising result in Example 2, 


«(A) = (1.0001 + 1.0000)(5000.5 + 5000.0) ~ 20,002. B 


In practice, A~? will not be known, so that in computing the condition number «(A), one 
must estimate || A~ “|. A method for this (proposed in 1979) is explained in Ref. [E9] listed 
in App. 1. 


Inaccurate Matrix Entries. (A) can be used for estimating the effect 5x of an inaccuracy 
6A of A (errors of measurements of the aj, for instance). Instead of Ax = b we then have 


(A + 6A)(x + 6x) = b. 
Multiplying out and subtracting Ax = b on both sides, we obtain 
Aéx + 6A(x + 6x) = 0. 
Multiplication by A”? from the left and taking the second term to the right gives 
dx = —A7!8A(x + 6x). 
Applying (11) with A7? and vector SA(x + 6x) instead of A and x, we get 
||8x|] = | ASA + 5x)|| = ]A7"|| |]SAC& + 5x)]]. 
Applying (11) on the right, with 6A and x — 6x instead of A and x, we obtain 


[|5x|| = |A~*|[[SAl| [x + Sxl]. 
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Now ||A71|| = «(A)/||A\] by the definition of «(A), so that division by ||x + 6x|| shows 
that the relative inaccuracy of x is related to that of A via the condition number by the 
inequality 


l|ox|| [lax a |6A | 
= [A |6Al] = «(A)-——- 


(16) = < 
IIx] [Ix + 6x [Al] 


Conclusion. If the system is well-conditioned, small inaccuracies ||5A\||/|| A|] can have 
only a small effect on the solution. However, in the case of ill-conditioning, if || 6A ||/|| A]| 
is small, || 5x||/||x|| may be large. 


Inaccurate Right Side. You may show that, similarly, when A is accurate, an inaccuracy 
ob of b causes an inaccuracy 6x satisfying 


| 6x| ||5b|| 
= 


<= (A) 
xl ~~ [ipl 


(17) 


Hence || 6x||/||x|| must remain relatively small whenever x(A) is small. 


Inaccuracies. Bounds (16) and (17) 


If each of the nine entries of A in Example 5 is measured with an inaccuracy of 0.1, then ||5A|| = 9 - 0.1 and 
(16) gives 
|x| 3-041 
S75 += 
IIx! 7 


= 0.321 thus |x|] = 0.321 |x|] = 0.321 - 16 = 5.14. 


By experimentation you will find that the actual inaccuracy ||5x|| is only about 30% of the bound 5.14. This is 
typical. 
Similarly, if Sb = [0.1 0.1 0.1]", then ||Sb|| = 0.3 and ||b|| = 42 in Example 5, so that (17) gives 


0. 
—— 375: 2 = 0.0536, hence || 5x|| S 0.0536 - 16 = 0.857 
but this bound is again much greater than the actual inaccuracy, which is about 0.15. | 


Further Comments on Condition Numbers. The following additional explanations 
may be helpful. 


1. There is no sharp dividing line between “well-conditioned” and “ill-conditioned,” 
but generally the situation will get worse as we go from systems with small «(A) to systems 
with larger k(A). Now always x(A) = 1, so that values of 10 or 20 or so give no reason 
for concern, whereas k(A) = 100, say, calls for caution, and systems such as those in 
Examples | and 2 are extremely ill-conditioned. 


2. If «(A) is large (or small) in one norm, it will be large (or small, respectively) in 
any other norm. See Example 5. 


3. The literature on ill-conditioning is extensive. For an introduction to it, see [E9]. 


This is the end of our discussion of numerics for solving linear systems. In the next section 
we consider curve fitting, an important area in which solutions are obtained from linear systems. 


SEC. 20.4 Linear Systems: Ill-Conditioning, Norms 


1-6} VECTOR NORMS 


Compute the norms (5), (6), (7). Compute a corresponding 
unit vector (vector of norm 1) with respect to the /..-norm. 


1.[1 -3 8 0 -6 O] 

2. [4 -1 8] 

3. [0.2 0.6 —2.1 3.0] 

4, [k*, 4k Kl k>4 

5.f1 1 1 1 J 

6. [0 0 0 1 QO 

7. For what x = [a bc] will ||x||z = ||x|l2? 
8 


. Show that ||x||.. = ||x|l2 = ||x|h1. 
[ 9-16 | MATRIX NORMS, 
CONDITION NUMBERS 


Compute the matrix norm and the condition number 
corresponding to the /4-vector norm. 


ee d ) 
11. 12, 
0 -VW5 6 5 


13..|.—2 3 0 14. | 0.01 1 0.01 


15.) 0 005 0O 


[21 105 7 5.25 


10.5 7 5.25. 4.2 
16. 
7 5.25 42 3.5 


[5.25 4.2 3.5 =) 


17. Verify (11) for x = [3 15 —4y" taken with the 
/,,-norm and the matrix in Prob. 13. 


18. Verify (12) for the matrices in Probs. 9 and 10. 
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PROBLEM SET 20-4 


19-20) ILL-CONDITIONED SYSTEMS 


Solve Ax = by, Ax = bg. Compare the solutions and 
comment. Compute the condition number of A. 


4.50 3.55 5.2 5.2 
19. A = > by = > bo = 
3.55 2.80 4.1 4.0 
1.7 4.7 4.7 
20. A = > by = > bo = 
1.7 1.0 2.7 2.71 


21. Residual. For Ax = b, in Prob. 19 guess what the 
residual of X =[-10.0  14.1]", very poorly approx- 
imating [—2 ay", might be. Then calculate and 
comment. 


22. Show that «(A) 2 1 for the matrix norms (10), (11), 
Sec. 20.3, and k(A) 2 Vn for the Frobenius norm (9), 
Sec. 20.3. 


23. CAS EXPERIMENT. Hilbert Matrices. The 3 < 3 
Hilbert matrix is 


1 1 

1 z 3 

= it 1 1 
H3 =| 5 3 4 
a: re 

L3 4 5 


The n Xn Hilbert matrix is Hy, = [Aj], where 
hj = [f(g +k — 1). (Similar matrices occur in 
curve fitting by least squares.) Compute the condition 
number «(H,,) for the matrix norm corresponding to 
the /..- (or /,-) vector norm, for n = 2,3,:-:,6 (or 
further if you wish). Try to find a formula that gives 
reasonable approximate values of these rapidly 
growing numbers. 

Solve a few linear systems of your choice, involving 
an H,,. 


24. TEAM PROJECT. Norms. (a) Vector norms in our 
text are equivalent, that is, they are related by double 
inequalities; for instance, 

(a) [|x|]. = [xl = allx\.. 
(18) 


1 
) a ixlh = Ix. = [xh 


Hence if for some x, one norm is large (or small), the 
other norm must also be large (or small). Thus in many 
investigations the particular choice of a norm is not 
essential. Prove (18). 


(b) The Cauchy-Schwarz inequality is 


Ixy] = [Ix[lllylle. 
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It is very important. (Proof in Ref. [GenRef7] listed 
in App. 1.) Use it to prove 


inequality. Prove that the matrix norms (10), (11) in 
Sec. 20.3 satisfy the axioms of a norm 


— 
(19a) Ixlb = [Ixlh = Vallxlk JA] = 0. 
|| Al] = 0 if and only if A = 0, 
1 _ 
(19b) yy lth = ||x|2 = [xh |KA|] = xl |All, 
e |A + Bl| = |]Al] + BI. 


(c) Formula (10) is often more practical than (9). 
Derive (10) from (9). 


(d) Matrix norms. IIlustrate (11) with examples. Give 
examples of (12) with equality as well as with strict 


25. 


WRITING PROJECT. Norms and Their Use in 
This Section. Make a list of the most important of the 
many ideas covered in this section and write a two- 
page report on them. 


20.5 Least Squares Method 


Having discussed numerics for linear systems, we now turn to an important application, 
curve fitting, in which the solutions are obtained from linear systems. 

In curve fitting we are given n points (pairs of numbers) (x1, y1),°**, (n, Yn) and we 
want to determine a function f(x) such that 


f@1) ~ yu» fOn) ~ Ya, 


approximately. The type of function (for example, polynomials, exponential functions, 
sine and cosine functions) may be suggested by the nature of the problem (the underlying 
physical law, for instance), and in many cases a polynomial of a certain degree will be 
appropriate. 

Let us begin with a motivation. 

If we require strict equality f(x1) = y1,°°-, f(*n) = yn and use polynomials of 
sufficiently high degree, we may apply one of the methods discussed in Sec. 19.3 in 
connection with interpolation. However, in certain situations this would not be the 
appropriate solution of the actual problem. For instance, to the four points 
(1) (— 1.3, 0.103), (—0.1, 1.099), (0.2, 0.808), (1.3, 1.897) 
there corresponds the interpolation polynomial f(x) = xe—x+] (Fig. 446), but if we 
graph the points, we see that they lie nearly on a straight line. Hence if these values 
are obtained in an experiment and thus involve an experimental error, and if the nature 
of the experiment suggests a linear relation, we better fit a straight line through 
the points (Fig. 446). Such a line may be useful for predicting values to be expected 
for other values of x. A widely used principle for fitting straight lines is the method 


Fig. 446. Approximate fitting of a straight line 
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of least squares by Gauss and Legendre. In the present situation it may be formulated 
as follows. 


Method of Least Squares. The straight line 
(2) y=at bx 
should be fitted through the given points (x1, y1),°**, (Xn; Yn) So that the sum of the 


squares of the distances of those points from the straight line is minimum, where 
the distance is measured in the vertical direction (the y-direction). 


The point on the line with abscissa x; has the ordinate a + bx;. Hence its distance from 
(xj, yj) 18 ly; =a> bx; (Fig. 447) and that sum of squares is 


n 
q= > (yj — a bx;)”. 
j=l 
q depends on a and b. A necessary condition for g to be minimum is 


oq 
an = -2>3'0; —a— bx;)=0 


oq 


ob 


(3) 
= -23) xj; (9; — a — bx) = 0 


(where we sum over j from | to n). Dividing by 2, writing each sum as three sums, and 
taking one of them to the right, we obtain the result 


an +b> x= dy; 
a>, 26h or b> a = > xpj- 


These equations are called the normal equations of our problem. 


(4) 


y 
(x59; 


) 


Fig. 447. Vetrical distance of a point (x;, y/) 
from a straight line y = a + bx 


EXAMPLE 1 Straight Line 
Using the method of least squares, fit a straight line to the four points given in formula (1). 


Solution. We obtain 


n=4,  Sxp=O01, Sx? = 3.43, Sy; = 3.907, Hix; = 2.3839. 
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Hence the normal equations are 


4a + 0.10b = 3.9070 


O.la + 3.43b = 2.3839. 
The solution (rounded to 4D) is a = 0.9601, b = 0.6670, and we obtain the straight line (Fig. 446) 


y = 0.9601 + 0.6670x. 1] 


Curve Fitting by Polynomials of Degree m 


Our method of curve fitting can be generalized from a polynomial y = a + bx to a 
polynomial of degree m 


(5) P(x) = bo + byx ++ + Dyx™ 


where m =n — 1. Then q takes the form 


g= > 0; - pe)? 
j=l 


and depends on m+ 1 parameters bo,-::, bm. Instead of (3) we then have m + | 
conditions 
oq oq 
6 —=0, 28, —=0 
(6) dbo dby, 


which give a system of m + 1 normal equations. 
In the case of a quadratic polynomial 


(7) P(x) = by + byx + box? 


the normal equations are (summation from | to n) 


bon + by >. xj + by >. x? = > yj 
(8) b>, GF h> x? + b>, x? = > «aj 
b>, xe + h> x? + b>, x4 = Sd xP y;. 


The derivation of (8) is left to the reader. 


EXAMPLE 2 Quadratic Parabola by Least Squares 
Fit a parabola through the data (0, 5), (2, 4), (4, 1), (6, 6), (8, 7). 


Solution. For the normal equations we need n = 5, xj = 20, Dx? = 120, yx} = 800, Sx} = 5664, 
Xyj = 23, Dxjvj = 104, DxPy; = 696. Hence these equations are 


5by + 20b, + 120b, = 23 
20by + 120b; + 800b, = 104 


120by + 800b, + 5664by = 696. 


SEC. 20.5 Least Squares Method 


875 


Solving them we obtain the quadratic least squares parabola (Fig. 448) 


y = 5.11429 — 1.41429x + 0.21429x?. 


Fig. 448. 


Least squares parabola in Example 2 


For a general polynomial (5) the normal equations form a linear system of equations in 
the unknowns bo,:--, by, When its matrix M is nonsingular, we can solve the system 
by Cholesky’s method (Sec. 20.2) because then M is positive definite (and symmetric). 
When the equations are nearly linearly dependent, the normal equations may become 
ill-conditioned and should be replaced by other methods; see [E5], Sec. 5.7, listed in 


App. I. 


The least squares method also plays a role in statistics (see Sec. 25.9). 


PROBLEM SET 2075 


1-6 


FITTING A STRAIGHT LINE 


Fit a straight line to the given points (x, y) by least squares. 
Show the details. Check your result by sketching the points 
and the line. Judge the goodness of fit. 


1. (0,2), (2,0), (3, -2), (5, —3) 


2. How does the line in Prob. 1 change if you add a point 
far above it, say, (1, 3)? Guess first. 


3. (0, 1.8), (1, 1.6), (2,1.1), (, 1.5), 


4. Hooke’s law F = ks. Estimate the spring modulus k 
from the force F [lb] and the elongation s [cm], where 
(F, s) = (1, 0.3), (2, 0.7), (4, 1.3), (6, 1.9), (10, 3.2), 
(20, 6.3). 

5. Average speed. Estimate the average speed Vay of a 
car traveling according to s = v + t [km] (s = distance 
traveled, ¢ [hr] = time) from (ft, s) = (9, 140), (10, 220), 
(11, 310), (12, 410). 

6. Ohm’s law U = Ri. Estimate R from (i, VU) = (2, 104), 
(4, 206), (6, 314), (10, 530). 


7. Derive the normal equations (8). 


(4, 2.3) 


8-11| FITTING A QUADRATIC PARABOLA 
Fit a parabola (7) to the points (x, y). Check by sketching. 


8. (-1,5), (,3), (2,4, (©, 8) 
9. (2, -3), (3,0), (5,1), (6,0) (7, -2) 
10. ¢ [hr] = Worker’s time on duty, y [sec] = His/her 


11. 


12. 


13. 


14. 


reaction time, (f, y) = (1, 2.0), (2, 1.78), (3, 1.90), 
(4, 2.35), (5, 2.70) 

The data in Prob. 3. Plot the points, the line, and the 
parabola jointly. Compare and comment. 

Cubic parabola. Derive the formula for the normal 
equations of a cubic least squares parabola. 

Fit curves (2) and (7) and a cubic parabola by least squares 
to Gy) = (~2, —30), (-1, -4), 0,4), U,4), @, 22), 
(3, 68). Graph these curves and the points on common 
axes. Comment on the goodness of fit. 

TEAM PROJECT. The least squares approximation 
of a function f(x) on an interval aSxZb by a 
function 


F(X) = aoyo(xX) + ayyiX) + 0+ + GmYm() 
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where yo(x),° +, ¥m(x) are given functions, requires the (b) Polynomial. What form does (10) take if 
determination of the coefficients ag, + ++, dy, such that F(x) = dg + ayx + +++ + Gyx™? What is the 
; coefficient matrix of (10) in this case when the interval 
©) | [f0&) ~ Fo? dx ered 
a (c) Orthogonal functions. What are the solutions of 
becomes minimum. This integral is denoted b TOV EE 208) Ss Yeni) ate cubogenal on the interyal 
If — Frel2, and lf— Fall is ime diet enue . a =x = b? (For the definition, see Sec. 11.5. See also 
ses me 7 Sec. 11.6.) 


f — Fm (L suggesting Lebesgue*). A necessary condition 
for that minimum is given by dll f — Frnl?/da; = 0, 
j = 0,--+,m [the analog of (6)]. (a) Show that this 


15. CAS EXPERIMENT. Least Squares versus Inter- 
polation. For the given data and for data of your 
choice find the interpolation polynomial and the least 


leads to m + | normal equations (j = 0,---, m) . : : ; 

squares approximations (linear, quadratic, etc.). 
m Compare and comment. 
2 hint, = bj where (a) (2,0), (-1,0, @1, (1,0), (2,0) 
(b) (—4,0), (—3,0), (—2,0), (—1,0), (0, 1), 
b (d,0), (2,0), (3,0), (4,0) 

(10) hin = | Ysa) ax, (c) Choose five points on a straight line, e.g., (0, 0), 
. Cd, 1),--:, (4,4). Move one point 1 unit upward and 
b find the quadratic least squares polynomial. Do this for 
bj = | f(x)yj(x) dx. each point. Graph the five polynomials on common 
a axes. Which of the five motions has the greatest effect? 


20.6 Matrix Eigenvalue Problems: Introduction 


We now come to the second part of our chapter on numeric linear algebra. In the first 
part of this chapter we discussed methods of solving systems of linear equations, which 
included Gauss elimination with backward substitution. This method is known as a direct 
method since it gives solutions after a prescribed amount of computation. The Gauss 
method was modified by Doolittle’s method, Crout’s method, and Cholesky’s method, 
each requiring fewer arithmetic operations than Gauss. Finally we presented indirect 
methods of solving systems of linear equations, that is, the Gauss-Seidel method and the 
Jacobi iteration. The indirect methods require an undetermined number of iterations. That 
number depends on how far we start from the true solution and what degree of accuracy 
we require. Moreover, depending on the problem, convergence may be fast or slow or our 
computation cycle might not even converge. This led to the concepts of ill-conditioned 
problems and condition numbers that help us gain some control over difficulties inherent 
in numerics. 

The second part of this chapter deals with some of the most important ideas and numeric 
methods for matrix eigenvalue problems. This very extensive part of numeric linear algebra 
is of great practical importance, with much research going on, and hundreds, if not 
thousands, of papers published in various mathematical journals (see the references in 
[E8], [E9], [E11], [E29]). We begin with the concepts and general results we shall need 
in explaining and applying numeric methods for eigenvalue problems. (For typical models 
of eigenvalue problems see Chap. 8.) 


5HENRI LEBESGUE (1875-1941), great French mathematician, creator of a modern theory of measure and 
integration in his famous doctoral thesis of 1902. 
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THEOREM -1 


An eigenvalue or characteristic value (or /atent root) of a givenn X nmatrix A = [a;x] 
is a real or complex number A such that the vector equation 


(1) Ax = Ax 

has a nontrivial solution, that is, a solution x # 0, which is then called an eigenvector or 
characteristic vector of A corresponding to that eigenvalue A. The set of all eigenvalues 
of A is called the spectrum of A. Equation (1) can be written 

(2) (A — ADx = 0 

where I is the 7 X n unit matrix. This homogeneous system has a nontrivial solution if 


and only if the characteristic determinant det (A — AI) is 0 (see Theorem 2 in Sec. 7.5). 
This gives (see Sec. 8.1) 


Eigenvalues 


The eigenvalues of A are the solutions X of the characteristic equation 


ay4—A a2 An 
24 dag —~A + dan 
G3) det (A — AT) = =o. 
anti an2 _ ann A 


Developing the characteristic determinant, we obtain the characteristic polynomial of A, 
which is of degree n in A. Hence A has at least one and at most n numerically different 
eigenvalues. If A is real, so are the coefficients of the characteristic polynomial. By familiar 
algebra it follows that then the roots (the eigenvalues of A) are real or complex conjugates 
in pairs. 

To give you some orientation of the underlying approaches of numerics for eigenvalue 
problems, note the following. For large or very large matrices it may be very difficult to 
determine the eigenvalues, since, in general, it is difficult to find the roots of characteristic 
polynomials of higher degrees. We will discuss different numeric methods for finding 
eigenvalues that achieve different results. Some methods, such as in Sec. 20.7, will give 
us only regions in which complex eigenvalues lie (Geschgorin’s method) or the intervals 
in which the largest and smallest real eigenvalue lie (Collatz method). Other methods 
compute all eigenvalues, such as the Householder tridiagonalization method and the 
QR-method in Sec. 20.9. 

To continue our discussion, we shall usually denote the eigenvalues of A by 


Ay, A2,°°°, An 
with the understanding that some (or all) of them may be equal. 


The sum of these n eigenvalues equals the sum of the entries on the main diagonal of 
A, called the trace of A; thus 


n nr 
(4) trace A = » yj = > Ar: 
j=l k=1 
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Also, the product of the eigenvalues equals the determinant of A, 
(5) det A = AjAgQ°-- Ay. 


Both formulas follow from the product representation of the characteristic polynomial, 
which we denote by f(A), 


FA) = (—D"A = AVA — Ag)-+ A = An). 


If we take equal factors together and denote the numerically distinct eigenvalues of A by 
Ay,°*:,A,(r =n), then the product becomes 


6) FA) = (DMA = ANA = Ag" = AD. 


The exponent m; is called the algebraic multiplicity of A;. The maximum number of 
linearly independent eigenvectors corresponding to A; is called the geometric multiplicity 
of Aj. It is equal to or smaller than m,. 

A subspace S of R” or C” (if A is complex) is called an invariant subspace of A if 
for every v in S the vector Av is also in S. Eigenspaces of A (spaces of eigenvectors; 
Sec. 8.1) are important invariant subspaces of A. 

Ann X n matrix B is called similar to A if there is a nonsingular n X n matrix T such that 


(7) Bea e-At. 


Similarity is important for the following reason. 


Similar Matrices 


Similar matrices have the same eigenvalues. If x is an eigenvector of A, then 
y= T ‘xisan eigenvector of B in (7) corresponding to the same eigenvalue. (Proof 
in Sec. 8.4.) 


Another theorem that has various applications in numerics is as follows. 


Spectral Shift 


If A has the eigenvalues A4,-++, Ay, then A — kI with arbitrary k has the eigenvalues 
Ay — ky, An — k. 


This theorem is a special case of the following spectral mapping theorem. 


Polynomial Matrices 


If X is an eigenvalue of A, then 
QA) = asdS + as—yAS~1 + +++ + aA + a 
is an eigenvalue of the polynomial matrix 


q(A) = a,AS + ag_yAS 1 + +++ + QyA + aol. 
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PROOF 


THEOREM 5 


Ax = Ax implies A’x = AAx = AAx = x, APx = 3x, etc. Thus 


g(A)x = (asAS + ag_yAS-1 + +++) x 


a,ASx + a,_jAS- ix + +: 


= a,ASx + ag_yAS1x + +++ = q(A) x. | 


The eigenvalues of important special matrices can be characterized as follows. 


Special Matrices 


The eigenvalues of Hermitian matrices (1.e., A‘ = A), hence of real symmetric matrices 
(i.e., Me A), are real. The eigenvalues of skew-Hermitian matrices (i.e., A =—A), 
hence of real skew-symmetric matrices | (Le., A’ = —A), are pure imaginary or 0. The 
eigenvalues of unitary matrices (i.e., A = A), hence of orthogonal matrices (1.e., 
A’ = A“4), have absolute value 1. (Proofs in Secs. 8.3 and 8.5.) 


The choice of a numeric method for matrix eigenvalue problems depends essentially on 
two circumstances, on the kind of matrix (real symmetric, real general, complex, sparse, 
or full) and on the kind of information to be obtained, that is, whether one wants to know 
all eigenvalues or merely specific ones, for instance, the largest eigenvalue, whether 
eigenvalues and eigenvectors are wanted, and so on. It is clear that we cannot enter into 
a systematic discussion of all these and further possibilities that arise in practice, but we 
shall concentrate on some basic aspects and methods that will give us a general 
understanding of this fascinating field. 


20.7 Inclusion of Matrix Eigenvalues 


THEOREM 1 


The whole of numerics for matrix eigenvalues is motivated by the fact that, except for a 
few trivial cases, we cannot determine eigenvalues exactly by a finite process because these 
values are the roots of a polynomial of nth degree. Hence we must mainly use iteration. 

In this section we state a few general theorems that give approximations and error 
bounds for eigenvalues. Our matrices will continue to be real (except in formula (5) below), 
but since (nonsymmetric) matrices may have complex eigenvalues, complex numbers will 
play a (very modest) role in this section. 

The important theorem by Gerschgorin gives a region consisting of closed circular disks 
in the complex plane and including all the eigenvalues of a given matrix. Indeed, for each 
jJ = 1,:-+-+,n the inequality (1) in the theorem determines a closed circular disk in the 
complex A-plane with center aj; and radius given by the right side of (1); and Theorem | 
states that each of the eigenvalues of A lies in one of these n disks. 


Gerschgorin’s Theorem® 


Let X be an eigenvalue of an arbitrary n X n matrix A = [aj]. Then for some 
integer j (1 Sj =n) we have 


() jay = Al S legal + lajel oe + lay geal © legged] Fee + lal 


SSEMYON ARANOVICH GERSCHGORIN (1901-1933), Russian mathematician. 
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Let x be an eigenvector corresponding to an eigenvalue A of A. Then 
(2) Ax = Ax or (A — ADx = 0. 
Let x; be a component of x that is largest in absolute value. Then we have | earl Xs =I 
for m = 1,---,n. The vector equation (2) is equivalent to a system of n equations for the 
n components of the vectors on both sides. The jth of these n equations with j as just 
indicated is 

Qj1X1 Spee ee Qj,j7-—1Xj-1 ale (aj _ A)x; + Qj, j+1Xj+1 ae ee ate Aintn = 0. 


Division by x; (which cannot be zero; why?) and reshuffling terms gives 


XY Xj-1 Xj+1 Xn 
qj,j-1 GGj+1 Xj Gin 


aij Xr aj1 Xj Xj x; . 
By taking absolute values on both sides of this equation, applying the triangle inequality 
la+b| S lal + |d| (where a and b are any complex numbers), and observing that 
because of the choice of j (which is crucial!), lx 1/x,| = 1,4, Hin 25 = 1, we obtain (1), 


and the theorem is proved. a 


Gerschgorin’s Theorem 


For the eigenvalues of the matrix 


af fl 

0 5 § 

A=|3 5 1 
1 

lz tI] 


we get the Gerschgorin disks (Fig. 449) 
D,: Center 0, radius 1, Dy: Center 5, radius 1.5, D3: Center 1, radius 1.5. 


The centers are the main diagonal entries of A. These would be the eigenvalues of A if A were diagonal. We 
can take these values as crude approximations of the unknown eigenvalues (3D-values) Ay = —0.209, 
Ag = 5.305, Az = 0.904 (verify this); then the radii of the disks are corresponding error bounds. 

Since A is symmetric, it follows from Theorem 5, Sec. 20.6, that the spectrum of A must actually lie in the 
intervals [—1, 2.5] and [3.5, 6.5]. 

It is interesting that here the Gerschgorin disks form two disjoint sets, namely, Dj U Ds, which contains two 
eigenvalues, and Ds, which contains one eigenvalue. This is typical, as the following theorem shows. 


Fig. 449. Gerschgorin disks in Example 1 
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Extension of Gerschgorin’s Theorem 


If p Gerschgorin disks form a set S that is disjoint from the n — p other disks of a 
given matrix A, then S contains precisely p eigenvalues of A (each counted with its 
algebraic multiplicity, as defined in Sec. 20.6). 


Idea of Proof. Set A = B + C, where B is the diagonal matrix with entries aj;, and 
apply Theorem | to Ay = B + ¢C with real t growing from 0 to 1. a 


Another Application of Gerschgorin’s Theorem. Similarity 


Suppose that we have diagonalized a matrix by some numeric method that left us with some off-diagonal entries 
of size 1075, say, 


2 10 107°] 
A={107> 2 107° |. 
}10-> 107° 4 


What can we conclude about deviations of the eigenvalues from the main diagonal entries? 


Solution. By Theorem 2, one eigenvalue must lie in the disk of radius 2 - 107° centered at 4 and two 
eigenvalues (or an eigenvalue of algebraic multiplicity 2) in the disk of radius 2 - 107° centered at 2. Actually, 
since the matrix is symmetric, these eigenvalues must lie in the intersections of these disks and the real axis, 
by Theorem 5 in Sec. 20.6. 

We show how an isolated disk can always be reduced in size by a similarity transformation. The matrix 


1 0 o |i 2 10> ~— 1078 J J. 0 
B=T VAT =|0 0 10-5 2 107° |} 0 0 
[0 0 107?}{ 107° ~~ 107° 4 {0 10° | 

2 107° 1] 

=| 107° 2 1 

|107"® to | 


is similar to A. Hence by Theorem 2, Sec. 20.6, it has the same eigenvalues as A. From Row 3 we get the 
smaller disk of radius 2 - 1071°. Note that the other disks got bigger, approximately by a factor of 10°. And in 
choosing T we have to watch that the new disks do not overlap with the disk whose size we want to decrease. 

For further interesting facts, see the book [E28]. |_| 


By definition, a diagonally dominant matrix A = [a;,] is ann X n matrix such that 


(3) ag) = Slee JH=lyes-yn 


k#j 


where we sum over all off-diagonal entries in Row j. The matrix is said to be strictly 
diagonally dominant if > in (3) for all 7. Use Theorem 1 to prove the following basic 


property. 


Strict Diagonal Dominance 


Strictly diagonally dominant matrices are nonsingular. 
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Further Inclusion Theorems 


An inclusion theorem is a theorem that specifies a set which contains at least one 
eigenvalue of a given matrix. Thus, Theorems | and 2 are inclusion theorems; they even 
include the whole spectrum. We now discuss some famous theorems that yield further 
inclusions of eigenvalues. We state the first two of them without proofs (which would 
exceed the level of this book). 


Schur’s Theorem’ 


Let A = laj,] be an X n matrix. Then for each of its eigenvalues d4,°+-, Ay, 
n n n 
(4) ml? = & Ad? = & S lal? (Schur’s inequality). 
i=l jelk=1 


In (4) the second equality sign holds if and only if A is such that 
(5) A‘A = AA’. 


Matrices that satisfy (5) are called normal matrices. It is not difficult to see that Hermitian, 
skew-Hermitian, and unitary matrices are normal, and so are real symmetric, skew-symmetric, 
and orthogonal matrices. 


Bounds for Eigenvalues Obtained from Schur’s Inequality 


For the matrix 


26 =2 2 


4 2 28 


we obtain from Schur’s inequality |A| = V1949 = 44.1475. You may verify that the eigenvalues are 30, 25, 
and 20. Thus 30” + 25? + 20? = 1925 < 1949; in fact, A is not normal. a 


The preceding theorems are valid for every real or complex square matrix. Other theorems 
hold for special classes of matrices only. Famous is the following one, which has various 
applications, for instance, in economics. 


Perron’s Theorem® 


Let A be a real n X n matrix whose entries are all positive. Then A has a positive 
real eigenvalue 4 = p of multiplicity 1. The corresponding eigenvector can be 
chosen with all components positive. (The other eigenvalues are less than p in 
absolute value.) 


7ISSAI SCHUR (1875-1941), German mathematician, also known by his important work in group theory. 
80SKAR PERRON (1880-1975) and GEORG FROBENIUS (1849-1917), German mathematicians, known 
for their work in potential theory, ODEs (Sec. 5.4), and group theory. 
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PROOF 


For a proof see Ref. [B3], vol. II, pp. 53-62. The theorem also holds for matrices with 
nonnegative real entries (“Perron—Frobenius Theorem”’®) provided A is irreducible, that 
is, it cannot be brought to the following form by interchanging rows and columns; here 
B and F are square and 0 is a zero matrix. 


B C 
0 F 


Perron’s theorem has various applications, for instance, in economics. It is interesting 
that one can obtain from it a theorem that gives a numeric algorithm: 


Collatz Inclusion Theorem? 


Let A = [aj] be a real n X n matrix whose elements are all positive. Let x be any 
real vector whose components X1,°**,Xy are positive, and let y1,::+,Yy be the 
components of the vector y = Ax. Then the closed interval on the real axis bounded 
by the smallest and the largest of the n quotients q; = y;/x; contains at least one 
eigenvalue of A. 


We have Ax = y or 
(6) y — Ax=0. 


The transpose A’ satisfies the conditions of Theorem 5. Hence A‘ has a positive eigenvalue 
A and, corresponding to this eigenvalue, an eigenvector u whose components u; are all 
positive. Thus A‘u = Au and by taking the transpose we obtain u'A = Au’. From this 
and (6) we have 


u(y — Ax) = uly — uAx = uly — Aux = u(y — Ax) = 0 


or written out 


n 
> uj(yj — Axj) = 0. 
j=l 


Since all the components u; are positive, it follows that 


yj — Ax; 2 0, that is, qj=a for at least one j, d 
an 


(7) 


yj — Axj = 0, that is, qj=aA for at least one j. 


Since A and A’ have the same eigenvalues, A is an eigenvalue of A, and from (7) the 
statement of the theorem follows. a 


SLOTHAR COLLATZ (1910-1990), German mathematician known for his work in numerics. 
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EXAMPLE 4 _ Bounds for Eigenvalues from Collatz’s Theorem. Iteration 


For a given matrix A with positive entries we choose an x = Xo and iterate, that is, we compute xj = Axo, 
Xg = AXj,°**, X99 = AXzg. In each step, taking x = x; and y = Ax; = x;, 1 we compute an inclusion interval 
by Collatz’s theorem. This gives (6S) 


[0.49 0.02 0.22] 1 | 0.73 | | 0.5481 | 
A=|0.02 028 0.20],x9 =|1),x1 =| 0.50], xo =| 0.3186 |, 


| 0.22 0.20 0.40 1 0.82 0.5886 


| 0.00216309 | | 0.00155743 | 
-++, X19 =| 0.00108155 |, x99 =| 0.000778713 


| 0.00216309 0.00155743 _| 


and the intervals 0.5 S A S 0.82, 0.3186/0.50 = 0.6372 S A S 0.5481/0.73 = 0.750822, etc. These intervals 
have length 


j 1 ) 3 10 15 20 


Length 0.32 0.113622 0.0539835 0.0004217 0.0000132 0.0000004 


Using the characteristic polynomial, you may verify that the eigenvalues of A are 0.72, 0.36, 0.09, so that those 
intervals include the largest eigenvalue, 0.72. Their lengths decreased with j, so that the iteration was worthwhile. 
The reason will appear in the next section, where we discuss an iteration method for eigenvalues. | 


PROBLEM—SET 20-7 


1-6 | GERSCHGORIN DISKS 9. If a symmetric n Xn matrix A = [aj] has been 
diagonalized except for small off-diagonal entries of 
size 10~°, what can you say about the eigenvalues? 


Find and sketch disks or intervals that contain the 


eigenvalues. If you have a CAS, find the spectrum and eves oy aes . 
10. Optimality of Gerschgorin disks. Illustrate with a 


compare. ; ‘ : 
2 X 2 matrix that an eigenvalue may very well lie on 
- 5 5) 4 - 5 10-2 1072 a Gerschgorin circle, so that Gerschgorin disks can 
generally not be replaced with smaller disks without 
1. | -2 0 2 2. | 1072 8 10-2 losing the inclusion property. 
> 4 7 10-2 1072 9 11. Spectral radius p(A). Using Theorem 1, show that 
L L p(A) cannot be greater than the row sum norm of A. 
0 04 —0.1 1 0 1 12-16; SPECTRAL RADIUS 
3. | -0.4 0 0.3 4.10 4 3 Use (4) to obtain an upper bound for the spectral radius: 
12. In Prob. 4 13. In Prob. 1 
= ve : EB : - 14. In Prob. 6 15. In Prob. 3 
ro & deel | i 4a =oe 16. In Prob. 5 
17. Verify that the matrix in Prob. 5 is normal. 
5. | i 3 0 6. 0.1 6 0 18. Normal matrices. Show that Hermitian, skew- 
{38 0 8 -02 0 3 Hermitian, and unitary matrices (hence real symmetric, 
7 7 skew-symmetric, and orthogonal matrices) are normal. 
7. Similarity. In Prob. 2, find T~'AT such that the radius Why is this of practical interest? 
of the Gerschgorin circle with center 5 is reduced by a 19. Prove Theorem 3 by using Theorem 1. 


factor 1/100. 
i a 20. Extended Gerschgorin theorem. Prove Theorem 2. 


8. By what integer factor can you at most reduce the Hint. Let A = B + C,B = diag (aj), Ay = B + tC, 
Gerschgorin circle with center 3 in Prob. 6? and let ¢ increase continuously from 0 to 1. 
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20.8 Power Method for Eigenvalues 


THEOREM -1 


PROOF 


A simple standard procedure for computing approximate values of the eigenvalues of an 
n Xn matrix A = [aj] is the power method. In this method we start from any vector 
Xo (# 0) with n components and compute successively 


X, = Axo, X2=AXxy, «°°, = Xs = AXs-1. 


For simplifying notation, we denote xs, by x and x; by y, so that y = Ax. 

The method applies to any n X n matrix A that has a dominant eigenvalue (a A such 
that |A| is greater than the absolute values of the other eigenvalues). If A is symmetric, it 
also gives the error bound (2), in addition to the approximation (1). 


Power Method, Error Bounds 


Let A be ann X n real symmetric matrix. Let x (# 0) be any real vector with n 
components. Furthermore, let 


_ ae eT, = ST 
y = Ax, mo = X X, my =x Vy, m,g=y Vy. 


Then the quotient 


my 


(1) Gees (Rayleigh’® quotient) 


is an approximation for an eigenvalue A of A (usually that which is greatest in 
absolute value, but no general statements are possible). 
Furthermore, if we set q = A — €, so that € is the error of q, then 


re eee 
(2) ele ne 8 


5” denotes the radicand in (2). Since m1, = gmg by (1), we have 


(3) (y — gx)"(y — gx) = mg — 2gm, + q?mg = mz — q?mo = & mo. 
Since A is real symmetric, it has an orthogonal set of n real unit eigenvectors Z1,-*+, Zn, 
corresponding to the eigenvalues A1,---, A,, respectively (some of which may be equal). 


(Proof in Ref. [B3], vol. 1, pp. 270-272, listed in App. 1.) Then x has a representation of 
the form 


KX = QyZ1 +++ + AyZy- 


10] ORD RAYLEIGH (JOHN WILLIAM STRUTT) (1842-1919), great English physicist and mathematician, 
professor at Cambridge and London, known for his important contributions to various branches of applied 
mathematics and theoretical physics, in particular, the theory of waves, elasticity, and hydrodynamics. In 1904 
he received a Nobel Prize in physics. 
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Now Az, = AqZj, etc., and we obtain 
y = AX = a44AqZ1 + +++ + AyAnZn 

and, since the z; are orthogonal unit vectors, 
(4) mo =x'x =a} t+-- + a2. 
It follows that in (3), 

Y — gx = ay(Ay — Qty + +7* + An(An — QEn- 
Since the z; are orthogonal unit vectors, we thus obtain from (3) 
(5) 5m = (y — qx)"(¥ — 4x) = ay — g) +00 + ann — 


Now let A, be an eigenvalue of A to which g is closest, where c suggests “closest.” Then 
(Ac — gS (Aj — = for j = 1,-++,n. From this and (5) we obtain the inequality 


5’mo = (A, — q°(at ote oases. S22 wy = (A, - q)"mo. 


Dividing by mo, taking square roots, and recalling the meaning of & gives 


This shows that 6 is a bound for the error € of the approximation q of an eigenvalue of 
A and completes the proof. a 


The main advantage of the method is its simplicity. And it can handle sparse matrices 
too large to store as a full square array. Its disadvantage is its possibly slow convergence. 
From the proof of Theorem | we see that the speed of convergence depends on the ratio 
of the dominant eigenvalue to the next in absolute value (2:1 in Example 1, below). 

If we want a convergent sequence of eigenvectors, then at the beginning of each step 
we scale the vector, say, by dividing its components by an absolutely largest one, as in 
Example 1, as follows. 


Application of Theorem 1. Scaling 


For the symmetric matrix A in Example 4, Sec. 20.7, andx9 = [1 1 1]" we obtain from (1) and (2) and the 
indicated scaling 


[0.49 0.02 0.22] 1 | 0.890244 | | 0.931193 | 
A=|0.02 0.28 0.20|, xo ={1], x1 =| 0.609756], x. =| 0.541284 


[0.22 0.20 0.40 1 1 1 


0.990663 | | 0.999707 | 0.999991 | 
X5 = 0.504682 |, X10 = 0.500146 | , X15 > 0.500005 |. 


1 1 1 
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Apply the power method without scaling (3 steps), using 
1 1)". Give Rayleigh quotients and 


xo =[l, lJ’ or[l 


887 


Here Axy = [0.73 0.5 0.82)", scaled to x, = [0.73/0.82 0.5/0.82 Wy, etc. The dominant eigenvalue is 
0.72, an eigenvector [1 0.5 1]'. The corresponding g and 6 are computed each time before the next scaling. 
Thus in the first step, 


my  xgAxg — 2.05 


0.683333 


ga(™ »\? _ (1.4553 
mo q 3 


This gives the following values of g, 6, and the error € = 0.72 — q (calculations with 10D, rounded to 6D): 


mo xd Xo 


“J ; —— 
q 


1/2 
.°) 0.134743. 
XOX 


j 1 2 5 10 

q 0.683333 0.716048 0.719944 0.720000 
5 0.134743 0.038887 0.004499 0.000141 
€ 0.036667 0.003952 0.000056 5+ 10-8 


The error bounds are much larger than the actual errors. This is typical, although the bounds cannot be improved; 
that is, for special symmetric matrices they agree with the errors. 

Our present results are somewhat better than those of Collatz’s method in Example 4 of Sec. 20.7, at the 
expense of more operations. | 


Spectral shift, the transition from A to A — KI, shifts every eigenvalue by —k. Although 
finding a good k can hardly be made automatic, it may be helped by some other method 
or small preliminary computational experiments. In Example 1, Gerschgorin’s theorem 
gives —0.02 = A S 0.82 for the whole spectrum (verify!). Shifting by —0.4 might be too 
much (then —0.42 = A S 0.42), so let us try —0.2. 


Power Method with Spectral Shift 


For A — 0.2I with A as in Example 1 we obtain the following substantial improvements (where the index | 
refers to Example | and the index 2 to the present example). 


j 1 2 5 10 
Oi: 0.134743 0.038887 0.004499 0.000141 
8, 0.134743 0.034474 0.000693 18-1078 
€, 0.036667 0.003952 0.000056 5-10-8 
€, 0.036667 0.002477 1.31078 9-10-22 a 


error bounds. Show the details of your work. 


yl 


5-8 


=1 1 
3 Ps 
2 3 


4. | -1.8 2.8 


PROBLEM SET 20-8 


POWER METHOD WITHOUT SCALING 


| 36-18 18 
~2.6 
18 -26 28 


POWER METHOD WITH SCALING 


Apply the power method (3 steps) with scaling, using 


Xo = [1 


1 yor 1 


1 1)", as applicable. Give 
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CHAP. 20 Numeric Linear Algebra 


12. CAS EXPERIMENT. Power Method with 


your work. 
5. The matrix in Prob. 3 
[4 2 3 
6. | 2 7 6 
| 3 6 4 
[5 1 0 0 
1 3 1 0 
7. 
0 1 3 1 
| 0 0 i 5 
[2 4 0 1 
4 1 2 8 
8. 
0 2 5 2 
[1 8 2 0 
9. Prove that if x is an eigenvector, then 6 = 0 in (2). 


10. 


11. 


Give two examples. 


Rayleigh quotient. Why does g generally approximate 
the eigenvalue of greatest absolute value? When will 
q be a good approximation? 

Spectral shift, smallest eigenvalue. In Prob. 3 set 
B = A — 3] (as perhaps suggested by the diagonal 
entries) and see whether you may get a sequence of q’s 
converging to an eigenvalue of A that is smallest (not 
largest) in absolute value. Use xp = [1 1 i. Do 
8 steps. Verify that A has the spectrum {0, 3, 5}. 


Scaling. Shifting. (a) Write a program for n Xn 
matrices that prints every step. Apply it to the 
(nonsymmetric!) matrix (20 steps), starting from 
foi ay. 


. 15 «12 3 
A=| 18 44 18 
Ae ag =e 


(b) Experiment in (a) with shifting. Which shift do you 
find optimal? 
(c) Write a program as in (a) but for symmetric matrices 
that prints vectors, scaled vectors, g, and 6. Apply it to 
the matrix in Prob. 8. 
0.6 0.8 
and 


(d). Optimality of 6. Consider A = 
0.8 —0.6 


3 
take xg = 


| . Show that g = 0, 6 = 1 for all steps 
= 

and the eigenvalues are +1, so that the interval 
[q — 6, q + 6] cannot be shortened (by omitting +1) 
without losing the inclusion property. Experiment with 
other xq’s. 

(e) Find a (nonsymmetric) matrix for which 6 in (2) is 
no longer an error bound. 


(f) Experiment systematically with speed of conver- 
gence by choosing matrices with the second greatest 
eigenvalue (i) almost equal to the greatest, (ii) some- 
what different, (iii) much different. 


20.9 Tridiagonalization and QR-Factorization 


We consider the problem of computing all the eigenvalues of a real symmetric matrix 
A = [aj], discussing a method widely used in practice. In the first stage we reduce the 
given matrix stepwise to a tridiagonal matrix, that is, a matrix having all its nonzero 
entries on the main diagonal and in the positions immediately adjacent to the main diagonal 
(such as Ag in Fig. 450, Third Step). This reduction was invented by A. S. Householder'! 
(J. Assn. Comput. Machinery 5 (1958), 335-342). See also Ref. [E29] in App. 1. 

This Householder tridiagonalization will simplify the matrix without changing its 
eigenvalues. The latter will then be determined (approximately) by factoring the tridiago- 
nalized matrix, as discussed later in this section. 


NALSTON SCOTT HOUSEHOLDER (1904-1993), American mathematician, known for his work in 
numerical analysis and mathematical biology. He was head of the mathematics division at Oakridge National 
Laboratory and later professor at the University of Tennessee. He was both president of ACM (Association for 
Computing Machinery) 1954-1956 and SIAM (Society for Industrial and Applied Mathematics) 1963-1964. 
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Householder’s Tridiagonalization Method" 


Ann X nreal symmetric matrix A = [a,,] being given, we reduce it by n — 2 successive 
similarity transformations (see Sec. 20.6) involving matrices P,,---, P,,_2 to tridiagonal 
form. These matrices are orthogonal and symmetric. Thus Py l= p/ = P, and similarly 
for the others. These transformations produce, from the given Ag = A = [a;;,], the matrices 


Ay = [aR], Ao = [a ],--+, An—2 = [aSf~] in the form 
Ay = PyAoP, 
Ag = PoAyPo 


() 


B = An-2 = Py-2An—3Pn-2. 


The transformations (1) create the necessary zeros, in the first step in Row | and Column 1, 
in the second step in Row 2 and Column 2, etc., as Fig. 450 illustrates fora 5 X 5 matrix. 
B is tridiagonal. 


* Ok ok * 3k 
* ck ok ck ok * ok ok * ck Gk 
* ok ok ok * Kk Ck ok * Gk ook 
* ok ok ok * ok ok * ok 
* ok ok kk ok * ok 
First Step Second Step Third Step 
A,=P,AP, A,=P,A,P, A, =P,A.P, 


Fig. 450. Householder’s method for a5 X 5 matrix. 
Positions left blank are zeros created by the method. 


How do we determine Pj, Ps,---, P,-2? Now, all these P,. are of the form 
(2) P, =I — 2v,v) (r = 1,-:-,n — 2) 


where Tis the n X n unit matrix and v, = [v;,] is a unit vector with its first r components 


0; thus 
0 0 0 
* 0 0 
(3) vi =! *I, vo =| * |, sid. —— 
* 
* * * 


where the asterisks denote the other components (which will be nonzero in general). 
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Step 1. v1 has the components 


U141 = 0 


(a) 
4 b aj1 SEN ag1 : a 
i= =3,4---, 
(4) (b) ae j n 
where 
(©) S$, = Vad) + a3) + 0 + Gey 
where S$; > 0, and sgn dg; = +1 if dg, 2 O and sgn ao, = —1 if dg, < 0. With this we 


compute P, by (2) and then A, by (1). This was the first step. 


Step 2. We compute v2 by (4) with all subscripts increased by 1 and the aj, replaced 


by ake, the entries of Aj just computed. Thus [see also (3)] 


Vi2 = Vag = 0 


() 
(4*) U32 = L (1 % lags ) 
2 So 
() qd) 
a2 Sgn ago . 45 
U;9 = ———__- =4,5,---,n 
pe 2032S2 J 
where 
2 2 2 
Sp = Vay + aQ? + --- + a. 


With this we compute Py, by (2) and then Ag by (1). 


Step 3. We compute vg by (4*) with all subscripts increased by 1 and the ayy replaced 


by the entries ap of Ay, and so on. 


Householder Tridiagonalization 


Tridiagonalize the real symmetric matrix 


Ee D 
no ££ 
= 
a 


Solution. Step 1. We compute S? = 47 + 1? + 1? = 18 from (4c). Since dg, = 4 > 0, we have sgn ag, = +1 
in (4b) and get from (4) by straightforward computation 
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0 0) 
Vor 0.98559856 
| oa, | | 011957316 
Var 0.11957316 
From this and (2), 
1 0 0 0 


—0.94280904 —0.23570227 —0.23570227 
Py 


Se: Oo). ©: 


—0.23570227 0.97140452 —0.02859548 
—0.23570227 —0.02859548 0.97140452 


From the first line in (1) we now get 


6 —Vv18 0 0 


-VI8 7 =I. 1 
Ay = PyAoP; = 4 = : 3 
0 - 3 3 
Step 2. From (4*) we compute S3 = 2 and 
0 0 
0 0) 
Vo = = 
U39 0.92387953 
Vag 0.38268343 
From this and (2), 
1 0 0 0 


0 1 0 0 
0 0 -1/V¥2 -1/v2| 
0 0 -I/V2 -1/V2 


Py 


The second line in (1) now gives 


6 -VI18 0 0 

—VI18 q, 2 0 

By = Az = PyAyPp = ; a a 
0 0 0 3 


This matrix B is tridiagonal. Since our given matrix has order n = 4, we needed n — 2 = 2 steps to accomplish 
this reduction, as claimed. (Do you see that we got more zeros than we can expect in general?) 

B is similar to A, as we now show in general. This is essential because B thus has the same spectrum as A, 
by Theorem 2 in Sec. 20.6. Bo 


B Similar to A. We assert that B in (1) is similar to A = Ag. The matrix P,. is symmetric; 
indeed, 


P,’ = 1 — 2v,v,)' =I" — 20,v,")' = 1 - 2v,v,! = P, 
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Also, P,. is orthogonal because vy, is a unit vector, so that V7 Vp = | and thus 


PP,! = P,? = (1 — 2v,v,!)? = 1 - 4v,v," + 4v,Vv, VV, 


=I — 4y,v,' + 4y,(v,/v,)v, =I. 
Hence P, a Pt = P,. and from (1) we now obtain 


B =F, .5A,,_4P, 9 = +3 

- =P,,_»P,,-3::: PAP, --- P,,_3P,_2 
= PoP a: PT AP) ** BP, =3P23 
= PAP 


where P = P,P5:--P,,~2. This proves our assertion. @ 


QR-Factorization Method 


In 1958 H. Rutishauser’? of Switzerland proposed the idea of using the LU-factorization 
(Sec. 20.2; he called it LR-factorization) in solving eigenvalue problems. An improved 
version of Rutishauser’s method (avoiding breakdown if certain submatrices become 
singular, etc.; see Ref. [E29]) is the QR-method, independently proposed by the American 
J. G. F. Francis (Computer J. 4 (1961-62), 265-271, 332-345) and the Russian V. N. 
Kublanovskaya (Zhurnal Vych. Mat. i Mat. Fiz. 1 (1961), 555-570). The QR-method uses 
the factorization QR with orthogonal Q and upper triangular R. We discuss the QR-method 
for a real symmetric matrix. (For extensions to general matrices see Ref. [E29] in App. 1.) 

In this method we first transform a given real symmetric n X n matrix A into a 
tridiagonal matrix Bp = B by Householder’s method. This creates many zeros and thus 
reduces the amount of further work. Then we compute By, By,--- stepwise according to 
the following iteration method. 


Step 1. Factor Bp = QoRo with orthogonal Qo and upper triangular Ro. Then compute 
By, = RoQo. 
Step 2. Factor B; = Q,R,. Then compute By = R1Q}. 

General Step s + 1. 


(a) Factor B, = Q,Rsg. 


(5) 
(b) Compute B,+1 = R;Qs. 


Here Q, is orthogonal and R, upper triangular. The factorization (Sa) will be explained 
below. 


B,..1 Similar to B. Convergence to a Diagonal Matrix. From (5a) we have R, = Q; ‘Bg. 
Substitution into (Sb) gives 


(6) Boat = R,Q; = Q; BQ. 


HEINZ RUTISHAUSER (1918-1970). Swiss mathematician, professor at ETH Zurich. Known for his 
pioneering work in numerics and computer science. 
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Thus B,,1 is similar to B,. Hence B,+1 is similar to Bp = B for all s. By Theorem 2, Sec. 
20.6, this implies that B,,; has the same eigenvalues as B. 

Also, B,,; is symmetric. This follows by induction. Indeed, Bp = B is symmetric. 
Assuming B, to be symmetric, that is, B,' = B,, and using Q;!=Q,! (since Q, is 
orthogonal), we get from (6) the symmetry, 


Bs+ ul = (Q,'B,Q,)" = Q,'B,'Q, = Q,'B,Q, = By +1. 


If the eigenvalues of B are different in absolute value, say, Aq > |As| > es SS lAn|. 


then 
lim B, = D 


where D is diagonal, with main diagonal entries Ay, Ag, --- , Ay. (Proof in Ref. [E29] listed 
in App. 1.) 


How to Get the QR-Factorization, say, B = Bo = [bj] = QoRo. The tridiagonal matrix 
B has n-— 1 generally nonzero entries below the main diagonal. These are 
ba3, b32,°++, by n—1. We multiply B from the left by a matrix Cy such that CoB = [by] 
has b§ = 0. We multiply this by a matrix C3 such that C3sCoB = [DY] has bS = 0, 
etc. After n — 1 such multiplications we are left with an upper triangular matrix Ro, 
namely, 


(7) C,Cy-1 ++: C3C2Bo = Ro. 
These n X n matrices C; are very simple. C; has the 2 X 2 submatrix 
cos 0; sin 0; 
(0; suitable) 
—sin 6; cos 6; 
in Rows j — 1 andj and Columns j — 1 and j; everywhere else on the main diagonal the 
matrix C; has entries 1; and all its other entries are 0. (This submatrix is the matrix of a 


plane rotation through the angle 6;; see Team Project 30, Sec. 7.2.) For instance, ifn = 4, 
writing cj = cos 6;, s; = sin 0;, we have 


co se 0 0 1 0 oO O 1 0 0 0 
so Cc, 0 0 0 c3 S3 0 0 1 0 0 

Co , C3 = , C4 = 
0 0 1 0 0 et | cg 0 0 0 C4 S4 
0 oOo oOo 1 0 oOo oOo 1 0 0 -sa 4 


These C; are orthogonal. Hence their product in (7) is orthogonal, and so is the inverse 
of this product. We call this inverse Qg. Then from (7), 


(8) Bo = QoRo 
where, with Cc =C 2 


(9) Qo = (CyCy—1°++ CgCo)~* = Ca"Cg"+++ Cy 'Cy". 
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This is our QR-factorization of Bo. From it we have by (5b) with s = 0 
(10) By = RoQo = RoC2'C3" ++: Cy—1'Cn’. 


We do not need Qo explicitly, but to get B, from (10), we first compute RoC." then 
(RoC3)C3, etc. Similarly in the further steps that produce Bo, Bs,::-. 


Determination of cos 0; and sin @;._ We finally show how to find the angles of rotation. 


cos 85 and sin 02 in Cy must be such that oS = 0 in the product 


ro) So 0 fet |) Dig Dy big 


Se C2 0 se | | Boy bog bog 


Now b& is obtained by multiplying the second row of Cg by the first column of B, 
a | = —Sob44 + Cobo = —(sin O5)b44 + (cos 65)bo1 = 0. 


Hence tan 02 = Se/c2 = be1/by1, and 


1 1 
COS Og = = 
V1 + tan20. V1 + (bo3/b11)" 
(11) 
tan 05 bo3/by1 
sin 0g = = : 
Vi1+tan?6, V1 + (bo1/b1)* 
Similarly for 63, 64,---. The next example illustrates all this. 


EXAMPLE 2. QR-Factorization Method 


Compute all the eigenvalues of the matrix 


Solution. We first reduce A to tridiagonal form. Applying Householder’s method, we obtain (see Example 1) 


6 —VI8 0 0 
—VI8 7 v2 ~«O 

i 0 v2 6 Of. 
0 0 0 3 
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From the characteristic determinant we see that Ag, hence A, has the eigenvalue 3. (Can you see this directly 
from Az?) Hence it suffices to apply the QR-method to the tridiagonal 3 3 matrix 


6 -—VIB8 0 
Bo = B=| -VI8 7 V2}. 
0 v2 6 
Step 1. We multiply B from the left by 
| cos 02 sin 05 0| [4 0 0] 
Cy =| —sin 65 cos 62 (0) and then CoB by C3 =|]0 cos 63 sin 03 
| O 0 1| [0 —sin 03 cos 63 | 


Here (—sin 62) - 6 + (cos 62)(—V 18) = 0 gives (11) cos 62 = 0.81649658 and sin 65 


these values we compute 


0.57735027. With 


| 734846923 —7.50555350 —0.81649658 ] 
CoB =| 0 3.26598632 1.15470054 |. 
[0 1.41421356 6.00000000 | 


In C3 we get from (—sin 63) - 3.26598632 + (cos 03) - 1.41421356 = 0 the values cos 63 = 0.91766294 and 
sin 63 = 0.39735971. This gives 


| 7.34846923 —7.50555350 —0.81649658 ] 
Ro = C3CoB =| 0 3.55902608  3.44378413 
| 0 0 5.04714615 | 


From this we compute 


| 10.33333333 —2.05480467 0 ] 
B, = RoC2'C3' =| —2.05480467 — 4.03508772 ~—-2.00553251 
| 0 2.00553251  4.63157895 | 


which is symmetric and tridiagonal. The off-diagonal entries in By are still large in absolute value. Hence we 
have to go on. 


Step 2. We do the same computations as in the first step, with Bp = B replaced by B, and Cz and C3 changed 
accordingly, the new angles being 02 = —0.196291533 and 63 = 0.513415589. We obtain 


| 10.53565375 —2.80232241 —0.39114588 | 
R, =| 0 4.08329584  3.98824028 
| 0 0 3.06832668 | 
and from this 
| 10.87987988 —0.79637918 0 7) 
By =| —0.79637918  5.44738664 —‘1.50702500 |. 
| 0 150702500 2.67273348 | 


We see that the off-diagonal entries are somewhat smaller in absolute value than those of By, but still much too 
large for the diagonal entries to be good approximations of the eigenvalues of B. 
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Fi urther Steps. We list the main diagonal entries and the absolutely largest off-diagonal entry, which is 
|p| = los? | in all steps. You may show that the given matrix A has the spectrum 11, 6, 3, 2. 


Step j bY? by by max jer [Dyn 
3 10.9668929 5.94589856 2.0872085 1 0.58523582 
5 10.9970872 6.00181541 2.00109738 0.12065334 
7 10.9997421 6.00024439 2.00001355 0.03591107 
9 10.9999772 6.00002267 2.00000017 0.01068477 ii 


Looking back at our discussion, we recognize that the purpose of applying Householder’s 
tridiagonalization before the QR-factorization method is a substantial reduction of cost in 
each QR-factorization, in particular if A is large. 

Convergence acceleration and thus further reduction of cost can be achieved by a 
spectral shift, that is, by taking B, — k;I instead of B, with a suitable k,. Possible choices 
of k, are discussed in Ref. [E29], p. 510. 


PROBLEM SET 2079 


1-5| HOUSEHOLDER TRIDIAGONALIZATION 3 52 10 42 
Tridiagonalize. Show the details. 52 59 A4 80 
= 5. 
0.98 0.04 0.44 10 44 39 42 
1. | 0.04 0.56 0.40 42 80 42 35 
| 0.44 0.40 0.80 6-9 | QR-FACTORIZATION 
ro 1 1 Do three QR-steps to find approximations of the eigen- 
values of: 
2.) 1 0 1 6. The matrix in the answer to Prob. 1 
l l 0 7. The matrix in the answer to Prob. 3 
r 14.2 —0.1 0 140 10 0 
7 2 
8.| -0.1 —6.3 0.2 9.| 10 70 2 
3..| 2 10 
3 6 0 0.2 2.1 0 2 —30 
7 10. CAS EXPERIMENT. QR-Method. Try to find out 
5 4 1 1 experimentally on what properties of a matrix the speed 
of decrease of off-diagonal entries in the QR-method 
4 # 5 1 1 depends. For this purpose write a program that first 
S l l 4 2 tridiagonalizes and then does QR-steps. Try the 
program out on the matrices in Probs. 1, 3, and 4. 
1 1 2 4 Summarize your findings in a short report. 


CHAP TER-20 REVIEW QUESTIONS AND PROBLEMS 


1. What are the main problem areas in numeric linear 3. What is pivoting? Why and how is it done? 
algebra? 


4. What happens if you apply Gauss elimination to a 


2. When would you apply Gauss elimination and when system that has no solutions? 
Gauss-Seidel iteration? 5. What is Cholesky’s method? When would you apply it? 
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6. What do you know about the convergence of the [5 1 1 
Gauss-Seidel iteration? 

7. What is ill-conditioning? What is the condition number 20. | 1 6 0 
and its significance? 1 0 8 


8. Explain the idea of least squares approximation. 


9. What are eigenvalues of a matrix? Why are they 21-23! GAUSS-—SEIDEL ITERATION 
important? Give typical examples. 


Do 3 st ithout scaling, starting from [1 1 1)", 
10. How did we use similarity transformations of matrices POE A nM ene te ant ] 


in designing numeric methods? 21. 4x1 — x2 = 22.0 
11. What is the power method for eigenvalues? What are 


its advantages and disadvantages? cs a 


12. State Gerschgorin’s theorem from memory. Give typical =X 4 + 4x3 = —2.4 
applications. 
13. What is tridiagonalization and QR? When would you Da Aa OAs 220 
apply it? 0.5x1 — 0.2x9 + 2.5x3 = —5.1 
14-17| GAUSS ELIMINATION 7.5x1 + O.lxg — 1.5xg3 = —12.7 
Solve 23. 10x4 We xg — x3 > 17 
14. 3xg — 6x3 = 0 2x1 + 20x92 x3 = 28 
4x4 = Xe + 2x3 = 16 3x4 = xXgvr 25x3 = 105 
5x1 + 2x2 — 4x3 = —20 24-26 | VECTOR NORMS 
15. 8x29 — 6x3 = 23.6 Compute the ¢-, f2-, and €..-norms of the vectors. 


24. [0.2 -8.1 04 0 0 -13 2] 


25. [8 —21 13 oj" 
12x, — 14x29 + 4x3 = —6.2 26.[(0 0 0 -1 oy 


10x4 a 6X2 a 2x3 = 68.4 


16. 5x4 a xo — 3x3 = 7 
27-30 | MATRIX NORM 


aa tag) Compute the matrix norm corresponding to the ¢,,-vector 


2x1 —3xe + 9x3 = 0 norm for the coefficient matrix: 
27. In Prob. 15 
17. 42x41 - TAx9 + 36x3 = 96 28. In Prob. 17 
46x, — 12x95 2x3 = 82 29. In Prob. 21 
0. In Prob. 22 
3x41 Tr 25x9 + 5x3 = 19 : oe 


31-33 | CONDITION NUMBER 


18-20} INVERSE MATRIX 


Compute the condition number (corresponding to the 
Compute the inverse of: €.,-vector norm) of the coefficient matrix: 


31. In Prob. 19 
32. In Prob. 18 
18. | 1.6 4.4 0.5 33. In Prob. 21 


[26° @4 33 


10.3 -43 28 


34-35 | FITTING BY LEAST SQUARES 


15 20 10 Fit and graph: 

34. A straight line to (—1,0), (0,2), (1,2), (2,3), 
(3, 3) 

10 15 90 35. A quadratic parabola to the data in Prob. 34. 


19. | 20 35 15 
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Find and graph three circular disks that must contain all the 
eigenvalues of the matrix: 


36. In Prob. 18 
37. In Prob. 19 


36-39 | EIGENVALUES 38. In Prob. 20 


39. Of the coefficients in Prob. 14 

40. Power method. Do 4 steps with scaling for the matrix 
in Prob. 19, starting for[1 1 1] and computing the 
Rayliegh quotients and error bounds. 


SUMMARY-OF-CHAPTER-2-0 


Numeric Linear Algebra 


Main tasks are the numeric solution of linear systems (Secs. 20.1—20.4), curve fitting 
(Sec. 20.5), and eigenvalue problems (Secs. 20.6—20.9). 
Linear systems Ax = b with A = [aj], written out 


Ey: 441X141 + 8 + AyXy = by 


Es: dgyXy t+ t+ + danXyn = bo 


() 


Eni QyiX1t +t+ + Gantn = bn 


can be solved by a direct method (one in which the number of numeric operations 
can be specified in advance, e.g., Gauss’s elimination) or by an indirect or iterative 
method (in which an initial approximation is improved stepwise). 

The Gauss elimination (Sec. 20.1) is direct, namely, a systematic elimination 
process that reduces (1) stepwise to triangular form. In Step 1 we eliminate x1 from 
equations Es to E,, by subtracting (a21/a11) E, from Eg, then (431/11) E; from 
Es, etc. Equation Ej is called the pivot equation in this step and ay, the pivot. In 
Step 2 we take the new second equation as pivot equation and eliminate xg, etc. If 
the triangular form is reached, we get x, from the last equation, then x,—, from 
the second last, etc. Partial pivoting (= interchange of equations) is necessary if 
candidates for pivots are zero, and advisable if they are small in absolute value. 

Doolittle’s, Crout’s, and Cholesky’s methods in Sec. 20.2 are variants of the 
Gauss elimination. They factor A = LU (L lower triangular, U upper triangular) 
and solve Ax = LUx = b by solving Ly = b for y and then Ux = ch for x. 

In the Gauss-Seidel iteration (Sec. 20.3) we make aj, A Ann = | 
(by division) and write Ax = (I + L + U)x = b; thus x = b — (L + U)x, which 
suggests the iteration formula 


(2) xt =b-— Lx@tb _ Ux™ 


in which we always take the most recent approximate x,’s on the i If ||C|| < 1, 
where C = —(1 + L)~1U, then this process converges. 
matrix norm (Sec. 20.3). 
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If the condition number k(A) = ||A\j ||A~ “| of A is large, then the system Ax = b 
is ill-conditioned (Sec. 20.4), and a small residual r = b — AX does not imply 
that x is close to the exact solution. 

The fitting of a polynomial p(x) = bg + byx + ++: + b,x” through given data 
(points in the xy-plane) (x1, yi), +, (%n, Yn) by the method of least squares is 
discussed in Sec. 20.5 (and in statistics in Sec. 25.9). If m = n, the least squares 
polynomial will be the same as an interpolating polynomial (uniqueness). 

Eigenvalues A (values A for which Ax = Ax has a solution x # 0, called an 
eigenvector) can be characterized by inequalities (Sec. 20.7), e.g. in Gerschgorin’s 
theorem, which gives n circular disks which contain the whole spectrum (all 
eigenvalues) of A, of centers aj; and radii > |ajx.| (sum over k from | to n, k # j). 

Approximations of eigenvalues can be obtained by iteration, starting from an 
Xo # 0 and computing x; = Axg, Xo = AXy,°-:,X, = AXx,-_,. In this power 
method (Sec. 20.8) the Rayleigh quotient 


(Ax)")x 
(3) q=— e=x,) 
xX X 


gives an approximation of an eigenvalue (usually that of the greatest absolute value) 
and, if A is symmetric, an error bound is 


(4) lel < /(Ax)Ax ~ q?. 
xX X 


Convergence may be slow but can be improved by a spectral shift. 

For determining all the eigenvalues of a symmetric matrix A it is best to first 
tridiagonalize A and then to apply the QR-method (Sec. 20.9), which is based on a 
factorization A = QR with orthogonal Q and upper triangular R and uses similarity 
transformations. 


CHAPTER 2 | 


P) Numerics for ODEs and PDEs 


Ordinary differential equations (ODEs) and partial differential equations (PDEs) play a 
central role in modeling problems of engineering, mathematics, physics, aeronautics, 
astronomy, dynamics, elasticity, biology, medicine, chemistry, environmental science, 
economics, and many other areas. Chapters 1-6 and 12 explained the major approaches 
to solving ODEs and PDEs analytically. However, in your career as an engineer, applied 
mathematicians, or physicist you will encounter ODEs and PDEs that cannot be solved 
by those analytic methods or whose solutions are so difficult that other approaches are 
needed. It is precisely in these real-world projects that numeric methods for ODEs and 
PDEs are used, often as part of a software package. Indeed, numeric software has become 
an indispensable tool for the engineer. 

This chapter is evenly divided between numerics for ODEs and numerics for PDEs. 
We start with ODEs and discuss, in Sec. 21.1, methods for first-order ODEs. The main 
initial idea is that we can obtain approximations to the solution of such an ODE at points 
that are a distance h apart by using the first two terms of Taylor’s formula from calculus. 
We use these approximations to construct the iteration formula for a method known as 
Euler’s method. While this method is rather unstable and of little practical use, it serves 
as a pedagogical tool and a starting point toward understanding more sophisticated methods 
such as the Runge—Kutta method and its variant the Runga—Kutta—Fehlberg (RKF) method, 
which are popular and useful in practice. As is usual in mathematics, one tends to 
generalize mathematical ideas. The methods of Sec. 21.1 are one-step methods, that is, 
the current approximation uses only the approximation from the previous step. Multistep 
methods, such as the Adams—Bashforth methods and Adams—Moulton methods, use values 
computed from several previous steps. We conclude numerics for ODEs with applying 
Runge—Kutta—Nystr6m methods and other methods to higher order ODEs and systems of 
ODEs. 

Numerics for PDEs are perhaps even more exciting and ingenious than those for ODEs. 
We first consider PDEs of the elliptic type (Laplace, Poisson). Again, Taylor’s formula 
serves as a starting point and lets us replace partial derivatives by difference quotients. 
The end result leads to a mesh and an evaluation scheme that uses the Gauss—Seidel 
method (here also know as Liebmann’s method). We continue with methods that use grids 
to solve Neuman and mixed problems (Sec. 21.5) and conclude with the important 
Crank—Nicholson method for parabolic PDEs in Sec. 21.6. 

Sections 21.1 and 21.2 may be studied immediately after Chap. 1 and Sec. 21.3 
immediately after Chaps. 2-4, because these sections are independent of Chaps. 19 and 20. 

Sections 21.4-21.7 on PDEs may be studied immediately after Chap. 12 if students 
have some knowledge of linear systems of algebraic equations. 


Prerequisite: Secs. 1.1-1.5 for ODEs, Secs. 12.1—12.3, 12.5, 12.10 for PDEs. 
References and Answers to Problems: App. | Part E (see also Parts A and C), App. 2. 
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21.1 Methods for First-Order ODEs 


Take a look at Sec. 1.2, where we briefly introduced Euler’s method with an example. 
We shall develop Euler’s method more rigorously. Pay close attention to the derivation 
that uses Taylor’s formula from calculus to approximate the solution to a first-order ODE 
at points that are a distance / apart. If you understand this approach, which is typical for 
numerics for ODEs, then you will understand other methods more easily. 

From Chap. 1 we know that an ODE of the first order is of the form F(x, y, y’) = 0 
and can often be written in the explicit form y’ = f(x, y). An initial value problem for 
this equation is of the form 


(1) y =f(x,y), yo) = yo 


where x9 and yo are given and we assume that the problem has a unique solution on some 
open interval a < x < b containing xo. 

In this section we shall discuss methods of computing approximate numeric values of 
the solution y(x) of (1) at the equidistant points on the x-axis 


Xy=Xg tA, Xg = Xo + 2h, x3 = Xo + 3h, 


where the step size / is a fixed number, for instance, 0.2 or 0.1 or 0.01, whose choice we 
discuss later in this section. Those methods are step-by-step methods, using the same 
formula in each step. Such formulas are suggested by the Taylor series 


! nh? ” 
(2) ye +h) = y(x) + hy’) + SP y"@) + 


Formula (2) is the key idea that lets us develop Euler’s method and its variant called— 
you guessed it—improved Euler method, also known as Heun’s method. Let us start by 
deriving Euler’s method. 

For small / the higher powers h?, h,- ++ in (2) are very small. Dropping all of them 
gives the crude approximation 


y(x + h) ~ yx) + hy'(x) 
= y(x) + hf, y) 


and the corresponding Euler method (or Euler-Cauchy method) 
(3) Ynt+1 = Yn + Mf(n; Yn) (n = 0, 1,---) 


discussed in Sec. 1.2. Geometrically, this is an approximation of the curve of y(x) by a 
polygon whose first side is tangent to this curve at xg (see Fig. 8 in Sec. 1.2). 


Error of the Euler Method. Recall from calculus that Taylor’s formula with 
remainder has the form 


yx + h) = yx) + Ay) + 5h" 
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(where x S € Sx + h). It shows that, in the Euler method, the truncation error in each 
step or local truncation error is proportional to h”, written O(h?), where O suggests order 
(see also Sec. 20.1). Now, over a fixed x-interval in which we want to solve an ODE, the 
number of steps is proportional to 1/h. Hence the total error or global error is proportional 
to mel /h) = h'. For this reason, the Euler method is called a first-order method. In 
addition, there are roundoff errors in this and other methods, which may affect the 
accuracy of the values y,, yo,--- more and more as n increases. 


Automatic Variable Step Size Selection in Modern Software. The idea of 
adaptive integration, as motivated and explained in Sec. 19.5, applies equally well to the 
numeric solution of ODEs. It now concerns automatically changing the step size h depending 
on the variability of y’ = f determined by 


(4*) y" = f' = fe + fyy' = fe + fof 


Accordingly, modern software automatically selects variable step sizes /,, so that the error 
of the solution will not exceed a given maximum size TOL (suggesting tolerance). Now for 
the Euler method, when the step size is h = hy, the local error at x, is about xh? ly" (En)I. 
We require that this be equal to a given tolerance TOL, 


(4) (a) Zh2ly"E)| = TOL, thus (b) hy,=,| a 
ly"En)I 


y"(x) must not be zero on the interval J: x9 S x = xy on which the solution is wanted. 
Let K be the minimum of | y"(x)| on J and assume that K > 0. Minimum | y"(x)| 
corresponds to maximum h = H = V2 TOL/K by (4). Thus, V2 TOL = HVK. We can 
insert this into (4b), obtaining by straightforward algebra 


(5) hy = OXy)H where Xn) = = : 
V lyn) 


For other methods, automatic step size selection is based on the same principle. 


Improved Euler Method. Predictor, Corrector. Euler’s method is generally much 
too inaccurate. For a large h (0.2) this is illustrated in Sec. 1.2 by the computation for 


(6) y=y+x, y0)=0. 


And for small 4 the computation becomes prohibitive; also, roundoff in so many steps 
may result in meaningless results. Clearly, methods of higher order and precision are 
obtained by taking more terms in (2) into account. But this involves an important practical 
problem. Namely, if we substitute y’ = f(x, y(x)) into (2), we have 


(2*) y(x + h) = yx) + Af + gh?f’ + ghPf! + + 

Now y in f depends on x, so that we have f’ as shown in (4*) and f”, f’” even much more 
cumbersome. The general strategy now is to avoid the computation of these derivatives 
and to replace it by computing f for one or several suitably chosen auxiliary values of 
(x, y). “Suitably” means that these values are chosen to make the order of the method as 
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high as possible (to have high accuracy). Let us discuss two such methods that are of 


practical importance, namely, the improved Euler method and the (classical) Runge-Kutta 
method. 


In each step of the improved Euler method we compute two values, first the predictor 
(7a) Yn+1 = Un ar hf (Xn, Yn); 


which is an auxiliary value, and then the new y-value, the corrector 


(7b) ee eee erro ade ren een 


Hence the improved Euler method is a predictor—corrector method: In each step we predict 
a value (7a) and then we correct it by (7b). 


In algorithmic form, using the notations ky = hf(xn, yn) in (7a) and kg = Af(xn+1, 
y% +1) in (7b), we can write this method as shown in Table 21.1. 


Table 21.1 Improved Euler Method (Heun’s Method) 


ALGORITHM EULER (f, Xo, yo, h, N) 


This algorithm computes the solution of the initial value problem y’ = f(x, y), y(vo) = yo 
at equidistant points x; = x9 + h, xg = Xo + 2h,---,xN = Xo + Nh; here f is such 
that this problem has a unique solution on the interval [x9, xy] (see Sec. 1.6). 


INPUT: Initial values x9, yo, step size h, number of steps NV 


OUTPUT: Approximation y,,,, to the solution yXy+1) atxn+1 = X09 + (n + Dh, 
wheren = 0,:-°:,N—-1 


For n = 0, 1,::-,N — 1 do: 
Xavi = Xn th 

ky = hf (ns Yn) 

ko = hf(Xn+1, Yn + K1) 
Ynt1 = Yn + aki + ke) 
OUTPUT Xn415¥n+1 


End 
Stop 
End EULER 


EXAMPLE 1_ Improved Euler Method. Comparison with Euler Method. 


Apply the improved Euler method to the initial value problem (6), choosing h = 0.2 as in Sec. 1.2. 


Solution. For the present problem we have in Table 21.1 


ky = 0.20n + Yn) 


kg = 0.2(¢n + 0.2 + yn + 0.20en + yn)) 


0.2 
Yn+1 = Yn 4 5 (2.2xy + 2.2y, + 0.2) = yy + 0.22(4y + yn) + 0.02. 
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Table 21.2 shows that our present results are much more accurate than those for Euler’s method in Table 21.1 but 
at the cost of more computations. 


Table 21.2 Improved Euler Method for (6). Errors 


Exact Values Error of Error of 
is *n Yn (4D) Improved Euler Euler 
0 0.0 0.0000 0.0000 0.0000 0.000 
1 0.2 0.0200 0.0214 0.0014 0.021 
2 0.4 0.0884 0.0918 0.0034 0.052 
3 0.6 0.2158 0.2221 0.0063 0.094 
4 0.8 0.4153 0.4255 0.0102 0.152 
5 1.0 0.7027 0.7183 0.0156 0.230 


Error of the Improved Euler Method. The local error is of order h® and the global 
error of order h?, so that the method is a second-order method. 


Setting $n = f(*n; Y¥(Xn)) and using (2*) (after (6)), we have 


(8a) yen + A) — yn) = hfn + anf + BP fn to 


Approximating the expression in the brackets in (7b) by fy + Fives and again using the 
Taylor expansion, we obtain from (7b) 
n ~ zh fe + fn+1] 
(8b) =2h[ int (in + hfn + 2h’ fn + °°] 
=hfn +20 fn + alfa t 


Ynt+1— 


(where ’ = d/dxy, etc.). Subtraction of (8b) from (8a) gives the local error 


he pn he pal, he pn 

rere. ot 

Since the number of steps over a fixed x-interval is proportional to 1/h, the global error 
is of order h3/ h= i, so that the method is of second order. 


Since the Euler method was an attractive pedagogical tool to teach the beginning of 
solving first-order ODEs numerically but had its drawbacks in terms of accuracy and could 
even produce wrong answers, we studied the improved Euler method and thereby 
introduced the idea of a predictor—corrector method. Although improved Euler is better 
than Euler, there are better methods that are used in industrial settings. Thus the practicing 
engineer has to know about the Runga—Kutta methods and its variants. 


Runge-Kutta Methods (RK Methods) 


A method of great practical importance and much greater accuracy than that of the 
improved Euler method is the classical Runge-Kutta method of fourth order, which we 
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call briefly the Runge-Kutta method.’ It is shown in Table 21.3. We see that in each 
step we first compute four auxiliary quantities kj, k, k3, ka and then the new value y,, +1. 
The method is well suited to the computer because it needs no special starting procedure, 
makes light demand on storage, and repeatedly uses the same straightforward compu- 
tational procedure. It is numerically stable. 

Note that, if f depends only on x, this method reduces to Simpson’s rule of integration 
(Sec. 19.5). Note further that ky,---,kq depend on n and generally change from step 
to step. 


Table 21.3 Classical Runge—Kutta Method of Fourth Order 


ALGORITHM RUNGE-KUTTA (f, x9, Yo, 1, N). 


This algorithm computes the solution of the initial value problem y’ = f(x, y), V(Xp) = Yo 
at equidistant points 


(9) X1,—=XoQT h, x2 = Xo t 2h,+++,xXxN = Xo + Nh; 


here f is such that this problem has a unique solution on the interval [x9, x,y] (see Sec. 1.7). 


INPUT: Function f, initial values x9, yo, step size h, number of steps N 
OUTPUT: Approximation y,,,, to the solution y(x,,.1) at Xn41 = Xo + (nt IDA, 
where n = 0,1,::-,N—- 1 
For n = 0, 1,::-,N — 1 do: 
ky = hf(®n, Yn) 
ko = hf + 5h, Yn T 3k) 
kg = hf(xn + 3h, yn + ake) 
ka = hf(ty + hy yn + kg) 
Xnt1=Xnt+h 
Yn+1 = Yn + G(ky + ky + 2kg + ka) 
OUTPUT xn +1, Yn+1 


End 
Stop 
End RUNGE-KUTTA 


1Named after the German mathematicians KARL RUNGE (Sec. 19.4) and WILHELM KUTTA (1867-1944). 
Runge [Math. Annalen 46 (1895), 167-178], the German mathematician KARL HEUN (1859-1929) [Zeitschr. 
Math. Phys. 45 (1900), 23-38], and Kutta [Zeitschr. Math. Phys. 46 (1901), 435-453] developed various similar 
methods. Theoretically, there are infinitely many fourth-order methods using four function values per step. The 
method in Table 21.3 is most popular from a practical viewpoint because of its “symmetrical” form and its 
simple coefficients. It was given by Kutta. 


906 


EXAMPLE 2 


CHAP. 21 Numerics for ODEs and PDEs 


Classical Runge—Kutta Method 


Apply the Runge-Kutta method to the initial value problem in Example 1, choosing h = 0.2, as before, and 
computing five steps. 


Solution. For the present problem we have f(x, y) = x + y. Hence 


ky = 0.2(%n + Yn), kg = 0.2(xn + 0.1 + yy, + 0.5k 4), 


kg = 0.2(x, + 0.1 + yy, + 0.5kQ), ka = 0.2(xyn + 0.2 + yy + kg). 


Table 21.4 shows the results and their errors, which are smaller by factors 10° and 10* than those for the two 
Euler methods. See also Table 21.5. We mention in passing that since the present ky,---,kq are simple, 
operations were saved by substituting k, into kg, then kg into kg, etc.; the resulting formula is shown in 
Column 4 of Table 21.4. Keep in mind that we have four function evaluations at each step. fal] 


Table 21.4 Runge-Kutta Method Applied to (4) 


” : 0.2214(x,, + yn) Exact Values (6D) 10° X Error 
ss it + 0.0214 VS — 5 — Il of y, 
0 0.0 0 0.021400 0.000000 0 
1 0.2 0.021400 0.070418 0.021403 3 
2 0.4 0.091818 0.130289 0.091825 7 
3 0.6 0.222107 0.203414 0.222119 12 
4 0.8 0.425521 0.292730 0.425541 20 
5 1.0 0.718251 0.718282 31 


Table 21.5 Comparison of the Accuracy of the Three Methods under Consideration 
in the Case of the Initial Value Problem (4), with h = 0.2 


Error 

x ie ee | Euler Improved Euler Runge-Kutta 

(Table 21.1) (Table 21.3) (Table 21.5) 
0.2 0.021403 0.021 0.0014 0.000003 
0.4 0.091825 0.052 0.0034 0.000007 
0.6 0.222119 0.094 0.0063 0.000011 
0.8 0.425541 0.152 0.0102 0.000020 
1.0 0.718282 0.230 0.0156 0.000031 


Error and Step Size Control. 
RKF (Runge—Kutta—Fehlberg) 


The idea of adaptive integration (Sec. 19.5) has analogs for Runge-Kutta (and other) 
methods. In Table 21.3 for RK (Runge-Kutta), if we compute in each step approximations 
¥and y with step sizes h and 2h, respectively, the latter has error per step equal to ? = 32 
times that of the former; however, since we have only half as many steps for 2h, the actual 
factor is 2°) 2 = 16, so that, say, 


62) ~ 16 W _ y2h — 2M _ MW ~ (16 — 1), 


and thus y y 
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Hence the error € = €™ for step size h is about 


(10) eae = 3) 

where ¥ — ¥ = yi = ya , as said before. Table 21.6 illustrates (10) for the initial value 
problem 

(11) y=Q-x-17 +2, yO) =1, 


the step size h = 0.1 and 0 =x =0.4. We see that the estimate is close to the actual 


error. This method of error estimation is simple but may be unstable. 


Table 21.6 Runge-Kutta Method Applied to the Initial Value Problem (11) 


and Error Estimate (10). Exact Solution y = tanx + x +1 


y 
(Step size h) 


(Step size 2h) 


Error 
Estimate (10) 


Actual 
Error 


Exact 
Solution (9D) 


1.000000000 
1.200334589 
1.402709878 
1.609336039 
1.822792993 


1.000000000 


1.402707408 


1.822788993 


0.000000000 


0.000000165 


0.000000267 


0.000000000 
0.000000083 
0.000000157 
0.000000210 
0.000000226 


1.000000000 
1.200334672 
1.402710036 
1.609336250 
1.822793219 


RKF._ E. Fehlberg [Computing 6 (1970), 61-71] proposed and developed error control 
by using two RK methods of different orders to go from (xy, yn) to (%n+1, Yn+1). The 
difference of the computed y-values at x,+1 gives an error estimate to be used for step 
size control. Fehlberg discovered two RK formulas that together need only six function 
evaluations per step. We present these formulas here because RKF has become quite 
popular. For instance, Maple uses it (also for systems of ODEs). 

Fehlberg’s fifth-order RK method is 


(12a) Ynt1 = Yn + yiki + ++: + Yoke 
with coefficient vector y = [y1°-: Ye], 

(12b) y=([i5 0 ies 38430 ae 
His fourth-order RK method is 

(13a) Ynt1 = Yn + Yiki + + + Y5k5 
with coefficient vector 

(13b) y*=[si6 9 3565 4104 —8).- 
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In both formulas we use only six different function evaluations altogether, namely, 


ky = hf(xn, Yn) 
ko =hfltn + ah, yn t+ aka) 
kg =hflin+$h, Yn + gykit 3yke) 


_ 12 1932 7200 7296 
ka = fn + 73h, Yn + a197k1 — at97k2 + a197k3) 


(14) 


ks =hfnth,  yot $eki-  8ko + 8Pks — orks) 


1859 


ke =hfGn+tsh, Yne—- dykit ke — Bask + 


The difference of (12) and (13) gives the error estimate 


4104 


ka 


x05). 


= 4 1 128 2197 1 2 
(5) €n41 ~ Yn+1 — Yn+1 = 360k1 — a975k3 — We aa0ka + 50Ks5 + B5ke- 


Runge—Kutta—Fehlberg 


For the initial value problem (11) we obtain from (12)-(14) with h = 0.1 in the first step the 12S-values 


ky = 0.200000000000 kg = 0.200062500000 
kg = 0.200140756867 ka = 0.200856926154 
ks = 0.201006676700 kg = 0.200250418651 


y', = 1.20033466949 
yz = 1.20033467253 


and the error estimate 


€1 ~ y1 — y¢ = 0.00000000304. 


The exact 12S-value is y(0.1) = 1.20033467209. Hence the actual error of y; is —4.4 + 107?°, smaller than that 


in Table 21.6 by a factor of 200. 


Table 21.7 summarizes essential features of the methods in this section. It can be shown 
that these methods are numerically stable (definition in Sec. 19.1). They are one-step 
methods because in each step we use the data of just one preceding step, in contrast to 
multistep methods where in each step we use data from several preceding steps, as we 


shall see in the next section. 


Table 21.7 Methods Considered and Their Order (= Their Global Error) 


Method Fubcion Evaluanon Global Error Local Error 
per Step 
Euler 1 O(h) Oth?) 
Improved Euler 2 O(h?) Oth?) 
RK (fourth order) 4 O(h*) O(h*) 
RKF 6 O(h°) O(h®) 
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EXAMPLE 4 


Backward Euler Method. Stiff ODEs 


The backward Euler formula for numerically solving (1) is 


(16) Yn+1 — Yn ar hf @n+1, Yn+1) (n = 0, 1, iat ). 


This formula is obtained by evaluating the right side at the new location (x41, Yn+1); 
this is called the backward Euler scheme. For known y,, it gives y,+1 implicitly, so it 
defines an implicit method, in contrast to the Euler method (3), which gives y,,+4 
explicitly. Hence (16) must be solved for y,,4 1. How difficult this is depends on fin (1). 
For a linear ODE this provides no problem, as Example 4 (below) illustrates. The method 
is particularly useful for “stiff? ODEs, as they occur quite frequently in the study of 
vibrations, electric circuits, chemical reactions, etc. The situation of stiffness is roughly 
as follows; for details, see, for example, [E5], [E25], [E26] in App. 1. 

Error terms of the methods considered so far involve a higher derivative. And we ask 
what happens if we let increase. Now if the error (the derivative) grows fast but the desired 
solution also grows fast, nothing will happen. However, if that solution does not grow fast, 
then with growing h the error term can take over to an extent that the numeric result becomes 
completely nonsensical, as in Fig. 451. Such an ODE for which h must thus be restricted 
to small values, and the physical system the ODE models, are called stiff. This term is 
suggested by a mass-—spring system with a stiff spring (spring with a large k; see Sec. 2.4). 
Example 4 illustrates that implicit methods remove the difficulty of increasing h in the case 
of stiffness: It can be shown that in the application of an implicit method the solution remains 
stable under any increase of h, although the accuracy decreases with increasing h. 


Backward Euler Method. Stiff ODE 


The initial value problem 


y’ = f(x, y) 20hy + 20x? + 2x, y(0) = 1 


has the solution (verify!) 
y = e208 4 x? 


The backward Euler formula (16) is 


Ynt+1 = Yn + hf&n+1 Yn+) Yn t h( 20¥n+1 t 20x 7.41 t 2xXn+1)- 


Noting that x,+1 = x, + h, taking the term —20y,,+1 to the left, and dividing, we obtain 


Yn + h{20y, + hy? + 20%, + h)| 


16* y. 
men es 1 + 20h 


The numeric results in Table 21.8 show the following. 


Stability of the backward Euler method for # = 0.05 and also for h = 0.2 with an error increase by about a 
factor 4 for h = 0.2, 


Stability of the Euler method for = 0.05 but instability for h = 0.1 (Fig. 451), 
Stability of RK for h = 0.1 but instability for h = 0.2. 


This illustrates that the ODE is stiff. Note that even in the case of stability the approximation of the solution 
near x = 0 is poor. =] 


Stiffness will be considered further in Sec. 21.3 in connection with systems of ODEs. 
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-1.0 


Fig. 451. 


Euler method with h = 0.1 for the stiff 


ODE in Example 4 and exact solution 


Table 21.8 Backward Euler Method (BEM) for Example 6. Comparison with Euler and RK 


e BEM BEM Euler Euler RK RK Hee 
h=005 h=02 h=0.05 h=0.1 h=0.1 h = 0.2 

0.0 1.00000 1.00000 1.00000 1.00000 1.00000 1.000 1.00000 
0.1 0.26188 0.00750 —1.00000 0.34500 0.14534 
0.2. 0.10484 0.24800 0.03750 1.04000 0.15333 5.093 0.05832 
0.3 0.10809 0.08750 —0.92000 0.12944 0.09248 
0.4 0.16640 0.20960 0.15750 1.16000 0.17482 25.48 0.16034 
0.5 0.25347 0.24750 —0.76000 0.25660 0.25004 
0.6 0.36274 0.37792 0.35750 1.36000 0.36387 127.0 0.36001 
0.7 0.49256 0.48750 —0.52000 0.49296 0.49001 
0.8 0.64252 0.65158 0.63750 1.64000 0.64265 634.0 0.64000 
0.9 0.81250 0.80750 —0.20000 0.81255 0.81000 
1.0 1.00250 1.01032 0.99750 2.00000 1.00252 3168 1.00000 


1-4| EULER METHOD 
Do 10 steps. Solve exactly. Compute the error. Show 
details. 
1. y' +02y=0, y0)=5, h=02 
Vi-y, y0)=0, h=01 
3. y= -x*, y0)=0, h=O01 
4.y=( +x y0)=0, h=0.1 


5-10 | IMPROVED EULER METHOD 

Do 10 steps. Solve exactly. Compute the error. Show 

details. 

5.y =y, yO=H1, h=O01 

6. y =2(1+ y?), y(0)=0, h = 0.05 

7. y' —xy?=0, y0)=1, h=0.1 

8. Logistic population model. y’ = y — y?, (0) = 0.2, 
h=0.1 


PROBLEM SET 21-1 


9. Do Prob. 7 using Euler’s method with h = 0.1 and com- 
pare the accuracy. 
10. Do Prob. 7 using the improved Euler method, 20 steps 
with h = 0.05. Compare. 


CLASSICAL RUNGE-KUTTA METHOD 

OF FOURTH ORDER 

Do 10 steps. Compare as indicated. Show details. 

11. y’ — xy? =0, y(0)=1, A=0.1. Compare with 
Prob. 7. Apply the error estimate (10) to yo. 

12. y' =y—y?, yO) =0.2, h=0.1. Compare with 
Prob. 8. 

13. y'=1+y%, y0)=0, h=01 

14. yy =(-x7 dy, yd) =1, h=01 

15. y’ + ytanx =sin2x, yO)=1, h=O.1 

16. Do Prob. 15 with h = 0.2, 5 steps, and compare the 
errors with those in Prob. 15. 
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17. y’ = 4x9y?,_ yO) = 0.5, h=0.1 
18. Kutta’s third-order method is defined by yy+1 = 


19. 


21.2 Multistep Methods 


yn + (ky + 4ko + k3) with ky and ky as in RK 
(Table 21.3) and k3 = hf(%n41,¥n — ky + 2k). 
Apply this method to (4) in (6). Choose h = 0.2 and 
do 5 steps. Compare with Table 21.5. 
CAS EXPERIMENT. Euler—Cauchy vs. RK. Con- 
sider the initial value problem 
(17) y’ = GW — 0.01x?)? sin (x?) + 0.02x, 

y(0) = 0.4 
(solution: y = 1/[2.5 — S(@)] 4 0.01x7 where S(x) is 
the Fresnel integral (38) in App. 3.1). 
(a) Solve (17) by Euler, improved Euler, and RK 
methods for0 S$ x = 5 with steph = 0.2. Compare the 
errors for x = 1, 3, 5 and comment. 


20. 
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(b) Graph solution curves of the ODE in (17) for 
various positive and negative initial values. 

(c) Do a similar experiment as in (a) for an initial 
value problem that has a monotone increasing or 
monotone decreasing solution. Compare the behavior 
of the error with that in (a). Comment. 


CAS EXPERIMENT. RKF. (a) Write a program for 
RKF that gives xp, yn, the estimate (10), and, if the 
solution is known, the actual error €,). 

(b) Apply the program to Example 3 in the text 
(10 steps, # = 0.1). 

(c) €, in (b) gives a relatively good idea of the size 
of the actual error. Is this typical or accidental? Find 
out, by experimentation with other problems, on 
what properties of the ODE or solution this might 
depend. 


In a one-step method we compute y,+ 1 using only a single step, namely, the previous 
value y,. One-step methods are “self-starting,” they need no help to get going because 
they obtain y; from the initial value yo, etc. All methods in Sec. 21.1 are one-step. 

In contrast, a multistep method uses, in each step, values from two or more previous 
steps. These methods are motivated by the expectation that the additional information will 
increase accuracy and stability. But to get started, one needs values, say, yo, y1, ye, yg in 
a 4-step method, obtained by Runge-Kutta or another accurate method. Thus, multistep 
methods are not self-starting. Such methods are obtained as follows. 


Adams-—Bashforth Methods 


We consider an initial value problem 


(1) 


y' = f(x,y), 


y(Xo) = Yo 


as before, with f such that the problem has a unique solution on some open interval 
containing x9. We integrate y’ = f(x, y) from x», to Xy41 = Xn + h. This gives 


| y'@o) dx = yrns1) — yin) = | 


wy 


Un+1 


F(x, y(x)) dx. 


Xn, 


Now comes the main idea. We replace f(x, y(x)) by an interpolation polynomial p(x) (see 
Sec. 19.3), so that we can later integrate. This gives approximations y,,.1 of y(x,4 1) and 


Yn of y(n), 


(2) 


Yn+1 — Yn a | 


Un+1 


D(x) dx. 


Xn 
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Different choices of p(x) will now produce different methods. We explain the principle 
by taking a cubic polynomial, namely, the polynomial p3(x) that at (equidistant) 


Xn, Xn-1> Xn-2) Xn-3 
has the respective values 


tn = fn: Yn) 


In-1 — f@n-1) Yn-v) 
(3) 
In-2 = f@n-~2; Yn—2) 


In-3 = f@n~3; Yn—3) 


This will lead to a practically useful formula. We can obtain p3(x) from Newton’s 
backward difference formula (18), Sec. 19.3: 


ps) =fa t Vin tar + DV + err + Dr + DVA, 


where 


X= By 


We integrate p3(x) over x from xy to Xn+1 = Xy + h, thus over r from 0 to 1. Since 
Xx =Xy + hr, we have dx = hdr. 

The integral of $r(r + 1) is 7 and that of gr(r + I(r + 2) is 3. We thus obtain 

a : 1 5 3 
(4) p3 dx =h podr= H(i. + 3p +E Ve + 20%). 

' 2 12 8 
It is practical to replace these differences by their expressions in terms of f: 
Vin = fn — fn-1 


ve n = In ~ 2fn-1 + fn-2 
v3 n = fn = 3fn-1 + 3fn—2 — fn-3- 


We substitute this into (4) and collect terms. This gives the multistep formula of the 
Adams-Bashforth method? of fourth order 


h 
(5) Ynt+1 = Yn + 94 On = Wip=a ap 37fn—2 a fn —3): 
?Named after JOHN COUCH ADAMS (1819-1892), English astronomer and mathematician, one of the 


predictors of the existence of the planet Neptune (using mathematical calculations), director of the Cambridge 
Observatory; and FRANCIS BASHFORTH (1819-1912), English mathematician. 
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It expresses the new value y,,1 [approximation of the solution y of (1) at x, ] in terms 
of 4 values of f computed from the y-values obtained in the preceding 4 steps. The local 
truncation error is of order bh, as can be shown, so that the global error is of order hi; 
hence (5) does define a fourth-order method. 


Adams—Moulton Methods 


Adams—Moulton methods are obtained if for p(x) in (2) we choose a polynomial that 
interpolates f(x, y(x)) at Xn+1, Xn, Xn-1,°'* (as Opposed to xy, X¥,—1,°°* used before; this 
is the main point). We explain the principle for the cubic polynomial ps (x) that interpolates 
at Xn+1,Xn»Xn-1,Xn—2. (Before we had xy, X,n~1, Xn—2, Xn—3-) Again using (18) in 
Sec. 19.3 but now setting r = (x — x4 )/h, we have 


Ps) = fosr t Viper torr t+ DV far + Gr + D+ DV hpsr 


We now integrate over x from x, to X,+41 as before. This corresponds to integrating over 
r from —1 to 0. We obtain 


x. 
es 1 1 1 
| P3(x) dx = (Fuss — 5 Vnvi — re atl Vet) 


Replacing the differences as before gives 
Un+1 } 
ss h 
(6) Ynt+1 = Yn + | p3(x) dx = Yn + 934 Ofn+1 + 19f, — Sfr-1 + fn-2)- 


This is usually called an Adams—Moulton formula.? It is an implicit formula because 
Inti =f(Xn+1, Yn+1) appears on the right, so that it defines y,,41 only implicitly, in 
contrast to (5), which is an explicit formula, not involving y,,,; on the right. To use (6) 
we must predict a value y,;,1, for instance, by using (5), that is, 


h 
(7a) Ynt1 = Yn + 94 On — S9%n-1 + 31fn—2 — Mn—s)- 


The corrected new value y,+ is then obtained from (6) with f,+1 replaced by 
Siar = f(Xn+1, Yi+1) and the other f’s as in (6); thus, 


h 
(7b) Yn+1 — Yn ap 4 (Ce oP 19f, = Sn=1 Titry2)> 


This predictor—-corrector method (7a), (7b) is usually called the Adams—Moulton 
method of fourth order. It has the advantage over RK that (7) gives the error estimate 


1 Fe 
Ensi ~ 15(n+1 — Ynt1)s 


as can be shown. This is the analog of (10) in Sec. 21.1. 


3FOREST RAY MOULTON (1872-1952), American astronomer at the University of Chicago. For ADAMS 
see footnote 2. 
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Sometimes the name Adams—Moulton method is reserved for the method with several 
corrections per step by (7b) until a specific accuracy is reached. Popular codes exist for 
both versions of the method. 


Getting Started. In (5) we need fo, f1, fo, fg. Hence from (3) we see that we must first 
compute y,, yo, y3 by some other method of comparable accuracy, for instance, by RK or 
by RKF. For other choices see Ref. [E26] listed in App. 1. 


Adams-Bashforth Prediction (7a), Adams—Moulton Correction (7b) 


Solve the initial value problem 
(8) y=xt+y, 0)=0 


by (7a), (7b) on the interval 0 S x S 2, choosing h = 0.2. 


Solution. The problem is the same as in Examples 1 and 2, Sec. 21.1, so that we can compare the results. 
We compute starting values y1, yo, yg by the classical Runge-Kutta method. Then in each step we predict 
by (7a) and make one correction by (7b) before we execute the next step. The results are shown and compared 
with the exact values in Table 21.9. We see that the corrections improve the accuracy considerably. This is 
typical. | 


Table 21.9 Adams—Moulton Method Applied to the Initial Value Problem (8); 
Predicted Values Computed by (7a) and Corrected Values by (7b) 


Starting Predicted Corrected Exact 10° - Error 
% ce ve yx Wes Values of yp, 
0 0.0 0.000000 0.000000 0 
1 0.2 0.021400 0.021403 3 
2 0.4 0.091818 0.091825 i 
3 0.6 0.222107 0.222119 12 
4 0.8 0.425361 0.425529 0.425541 12 
5 1.0 0.718066 0.718270 0.718282 12 
6 1.2 1.119855 1.120106 1.120117 11 
7 1.4 1.654885 1.655191 1.655200 9 
8 1.6 2.352653 2.353026 2.353032 6 
9 1.8 3.249190 3.249646 3.249647 1 
10 2.0 4.388505 4.389062 4.389056 —6 


Comments on Comparison of Methods. An Adams—Moulton formula is generally 
much more accurate than an Adams—Bashforth formula of the same order. This justifies 
the greater complication and expense in using the former. The method (7a), (7b) is 
numerically stable, whereas the exclusive use of (7a) might cause instability. Step size 
control is relatively simple. If |Corrector — Predictor] > TOL, use interpolation to 
generate “old” results at half the current step size and then try 4/2 as the new step. 

Whereas the Adams—Moulton formula (7a), (7b) needs only 2 evaluations per step, 
Runge-Kutta needs 4; however, with Runge-Kutta one may be able to take a step size 
more than twice as large, so that a comparison of this kind (widespread in the literature) 
is meaningless. 

For more details, see Refs. [E25], [E26] listed in App. 1. 
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PROBLEM SET 217.2 


1-10 


ADAMS—MOULTON METHOD 


Solve the initial value problem by Adams—Moulton (7a), (7b), 
10 steps with 1 correction per step. Solve exactly and compute 
the error. Use RK where no starting values are given. 


1, 


.y =2xy, y(0) = 1, 
.y =14+y7, yO) =0, h=0.1, (0.100335, 


y =y, yO) =1, 
1.349858) 


h=0.1, (1.105171, 1.221403, 
h=0.1 


0.202710, 0.309336) 


4. Do Prob. 2 by RK, 5 steps, h = 0.2. Compare the errors. 


21.3 Methods for Systems 


. Do Prob. 3 by RK, 5 steps, h = 0.2. Compare the errors. 
~y =(y-x-1%+2, yO=1, A=0.1, 


10 steps 


.y' =3y— 12y, 0) =0.2, h=01 
.y =1-4y, yO) =0, A=0.1 

. yy = 3x2(1 + y), yO) =0, A = 0.05 
.y =x/y, 
- Do and show the calculations leading to (4)-(7) in the 


| 


yd) = 3, h=0.2 


text. 


. Quadratic polynomial. Apply the method in the text 


to a polynomial of second degree. Show that this leads 
to the predictor and corrector formulas 


h 
yn = Yn + 55 24m — 16fn-1 + Sfn-2), 


h 
Yn+1 = Yn 12 (Sfr+1 + 8fn — Sn-1)- 


13. 


14. 


15. 


Using Prob. 12, solve y’ = 2xy, (0) = 1 (10 steps, 
h = 0.1, RK starting values). Compare with the exact 
solution and comment. 


How much can you reduce the error in Prob. 13 by 
halfing h (20 steps, h = 0.05)? First guess, then 
compute. 


CAS PROJECT. Adams—Moulton. (a) Accurate 
starting is important in (7a), (7b). Illustrate this in 
Example 1 of the text by using starting values from 
the improved Euler—Cauchy method and compare the 
results with those in Table 21.8. 


(b) How much does the error in Prob. 11 decrease 
if you use exact starting values (instead of RK 
values)? 


(c) Experiment to find out for what ODEs poor 
starting is very damaging and for what ODEs it 
is not. 


(d) The classical RK method often gives the same 
accuracy with step 2h as Adams—Moulton with step 
h, so that the total number of function evaluations is 
the same in both cases. Illustrate this with Prob. 8. 
(Hence corresponding comparisons in the literature 
in favor of Adams—Moulton are not valid. See also 
Probs. 6 and 7.) 


and Higher Order ODEs 


Initial value problems for first-order systems of ODEs are of the form 


(1) 


in components 


yi = fi, y1.°**> Ym); 


yo = folX, y1.°°*s Ym) 


Vin = fm, Y1,°°*s Ym)- 


y’ = f(x, y), 


y(%o) = Yo. 


yi%o) = Y10 


ya(xo) = Yeo 


Ym(Xo0) = Ymo- 
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Here, f is assumed to be such that the problem has a unique solution y(x) on some open 
x-interval containing x9. Our discussion will be independent of Chap. 4 on systems. 

Before explaining solution methods it is important to note that (1) includes initial value 
problems for single mth-order ODEs, 


(2) SSG a 


and initial conditions y(x9) = Ky, y'(xo) = Ko,::-, ge aa = Ky» as special cases. 
Indeed, the connection is achieved by setting 


(3) MW=y ye=y, ys=yy ct, Ym =". 


Then we obtain the system 


Jal > ye 
y2 = 38 
(4) 
, 
Ym-1 — Ym 
Yn = £0 V1," Ym) 
and the initial conditions yy(%9) = Ky, yo(%o) = Ke, °°, Ym(Xo) = Km. 


Euler Method for Systems 


Methods for single first-order ODEs can be extended to systems (1) simply by writing vector 

functions y and f instead of scalar functions y and f, whereas x remains a scalar variable. 
We begin with the Euler method. Just as for a single ODE, this method will not be 

accurate enough for practical purposes, but it nicely illustrates the extension principle. 


Euler Method for a Second-Order ODE. Mass—Spring System 
Solve the initial value problem for a damped mass-spring system 
y" + 2y' +0.75y=0, y0)=3, y(0)=-25 


by the Euler method for systems with step h = 0.2 for x from 0 to 1 (where x is time). 


Solution. The Euler method (3), Sec. 21.1, generalizes to systems in the form 


(5) Yn+1 = Yn aR hin, Yn), 


in components 
Yin+1 = Yin + AfiGns Yim Yan) 
Yan+1 = Yan + AfeO&ns Yin» Y2,n) 
and similarly for systems of more than two equations. By (4) the given ODE converts to the system 


yt =A yi ye) = ye 


y2 = folx, 1, Y2) = —2y2 — 0.75y}. 
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Hence (5) becomes 


Yint1 = Yin + 0.2yan 


Y2n+ 1 Y2n 0.2( 2y2.n 0.75y1,n)- 


The initial conditions are y(0) = y,(0) = 3, y'(0) = yo(0) 2.5. The calculations are shown in Table 21.10. 
As for single ODEs, the results would not be accurate enough for practical purposes. The example merely serves 
to illustrate the method because the problem can be readily solved exactly, 


y = yy = 2e705* 4 ee thus oy! = yp = —e 9 5® — 1 5e71 57, im 


Table 21.10 Euler Method for Systems in Example 1 (Mass—Spring System) 


rae a y, Exact Error i Yo Exact Error 
i me @D) Swi = the ue (5D) €2 = Yo — Yan 

0 0.0 3.00000 3.00000 0.00000 —2.50000 —2.50000 0.00000 
1 0.2 2.50000 2.55049 0.05049 —1.95000 —2.01606 —0.06606 
2 04 2.11000 2.18627 0.76270 —1.54500 —1.64195 —0.09695 
3 0.6 1.80100 1.88821 0.08721 —1.24350 —1.35067 —0.10717 
4 0.8 1.55230 1.64183 0.08953 —1.01625  —1.12211 —0.10586 
5 1.0 1.34905 1.43619 0.08714 —0.84260 —0.94123 —0.09863 


Runge-Kutta Methods for Systems 


As for Euler methods, we obtain RK methods for an initial value problem (1) simply by 
writing vector formulas for vectors with m components, which, for m = 1, reduce to the 
previous scalar formulas. 

Thus, for the classical RK method of fourth order in Table 21.3, we obtain 


(6a) y(x0) = Yo (Initial values) 
and for each step n = 0, 1,-:-, N — 1 we obtain the 4 auxiliary quantities 


ky = hf(n, Yn) 


ky = hf, + 5h, Yn + oki) 
(6b) A ? 
kg = hf(xn + 5h, Yn + ako) 


kg =hfG@y,+h, Yn t+ ks) 
and the new value [approximation of the solution y(x) at xn+41 = x9 + (a + Dh] 
(6c) Ynt1 = Yn + G (ky + Wk + 2kg + ka). 


RK Method for Systems. Airy’s Equation. Airy Function Ai(x) 


Solve the initial value problem 


y" =xy, yO) = 1/873 - TQ) = 0.35502805, —-y'(0) = —1/(31/3 - P@)) = —0.25881940 
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by the Runge-Kutta method for systems with h = 0.2; do 5 steps. This is Airy’s equation,* which arose in 
optics (see Ref. [A13], p. 188, listed in App. 1). P is the gamma function (see App. A3.1). The initial conditions 
are such that we obtain a standard solution, the Airy function Ai(x), a special function that has been thoroughly 
investigated; for numeric values, see Ref. [GenRef1], pp. 446, 475. 


Solution. For y" = xy, setting y; = y, yo = yy = y’ we obtain the system (4) 
ty 
yi = y2 
fe sok 
ya, — XY1- 


Hence f = [f, _fa]' in (1) has the components fy (x, y) = yo, fo(%, y) = xy1. We now write (6) in components. 
The initial conditions (6a) are yj,9 = 0.35502805, yo9 = —0.25881940. In (6b) we have fewer subscripts by 
simply writing ky = a, kg = b, kg = ce, kg = d, so that a = [ay ag)", etc. Then (6b) takes the form 


lyon 
a=h 
L*nV1n 
| yan a 22 
b=h 1 ‘ 
L (Xn shin 544) 


(6b*) 
1 


| yon “2 
c=h ; i 
[Qn shin xb) 


Yan T C2 


L@n Se A)Yin + cy) 


For example, the second component of b is obtained as follows. f(x, y) has the second component fa(x, y) = xy1. 
Now in b & kg) the first argument is 


The second argument in b is 
Y= Yn + 3a, 
and the first component of this is 
V1 = Yan + 3a. 
Together, 
xy1 = (tn + 2A)1n + 241)- 
Similarly for the other components in (6b*). Finally, 


(6c*) Yntl = Yn + &(a + 2b + 2c + d). 


Table 21.11 shows the values y(x) = y, (x) of the Airy function Ai(x) and of its derivative y'(x) = yo(x) as well 
as of the (rather small!) error of y(x). 


4Named after Sir GEORGE BIDELL AIRY (1801-1892), English mathematician, who is known for his work 
in elasticity and in PDEs. 
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Table 21.11 RK Method for Systems: Values y, ,,(x,) of the Airy Function Ai(x) 
in Example 2 


n Xe YinXn) y1(x,) Exact (8D) 10° + Error of y, yon) 

0 0.0 0.35502805 0.35502805 0 —0.25881940 
1 0.2 0.30370303 0.30370315 12 —0.25240464 
2 0.4 0.25474211 0.25474235 24 —0.23583073 
3 0.6 0.20979973 0.20980006 33 —0.21279185 
4 0.8 0.16984596 0.16984632 36 —0.18641171 
5 1.0 0.13529207 0.13529242 35 —0.15914687 


Runge—Kutta—Nystr6m Methods (RKN Methods) 


RKN methods are direct extensions of RK methods (Runge-Kutta methods) to second-order 
ODEs y” = f(x, y, y’), as given by the Finnish mathematician E. J. Nystrém [Acta Soc. Sci. 
fenn., 1925, L, No. 13]. The best known of these uses the following formulas, where 
n = 0,1,-:-,N— 1 (N the number of steps): 

ky = hf ns Yns Yn) 

ko = 3hflin + 3h, yn + Kyym + ky) where K = 3h(yn + 3k) 

kg = ghfin + hyn + Kyyn + ke) 

kg = xhf (xn + hyyn +L, yn + 2kg) where L = h(y,, + kg). 


(7a) 


From this we compute the approximation y,+1 of y(xn+1) at Xn41 = X09 + (n+ DA, 
(7b) Ynv1 =Yn t+ hOn + 3(ki + ke + ks), 
and the approximation eel of the derivative y Wie 1) needed in the next step, 
(7c) Ynti = Yn + 3(ky + 2ky + 2kg + ka). 
RKN for ODEs y” = f(x, y) Not Containing y’. Then ky = kg in (7), which makes 
the method particularly advantageous and reduces (7a)—(7c) to 
ky = 5hf&n, Yn) 
ka = ghfn + 2h, yn + 3hOn + 3ky)) = ks 
(7*) ka = ghf(tn + Wyn + hOn + ke)) 
Yn+1 = Yn t (yn + 3(ki + 2ko)) 
Ynta = Yn + 3(ky + 4ky + ka). 


RKN Method. Airy’s Equation. Airy Function Ai(x) 


For the problem in Example 2 and h = 0.2 as before we obtain from (7*) simply ky = 0.1x,yy, and 


kg = kg = O0.1(%y + 0.1)0y + 0.1y;, + 0.05k1), kg = 0.1m + 0.2)(y + 0.2y7, + 0.2k2). 


Table 21.12 shows the results. The accuracy is the same as in Example 2, but the work was much less. sy] 
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Table 21.12 Runge—Kutta—Nystrém Method Applied to Airy’s Equation, 
Computation of the Airy Function y = Ai(x) 


8. 
Xe Vn a. y(x) Exact (8D) A of a 
0.0 0.35502805 —0.25881940 0.35502805 0 
0.2 0.30370304 —0.25240464 0.303703 15 11 
0.4 0.25474211 —0.23583070 0.25474235 24 
0.6 0.20979974 —0.21279172 0.20980006 32 
0.8 0.16984599 —0.18641134 0.16984632 33 
1.0 0.13529218 —0.15914609 0.13529242 24 


Our work in Examples 2 and 3 also illustrates that usefulness of methods for ODEs in the 
computation of values of “higher transcendental functions.” 


Backward Euler Method for Systems. Stiff Systems 


The backward Euler formula (16) in Sec. 21.1 generalizes to systems in the form 


(8) Yn+1 = Yn + hf(Xni1; Yn+1) (n = 0, 1,*>*). 


This is again an implicit method, giving y,,,1 implicitly for given y,,. Hence (8) must be 
solved for y,,,1. For a linear system this is shown in the next example. This example also 
illustrates that, similar to the case of a single ODE in Sec. 21.1, the method is very useful 
for stiff systems. These are systems of ODEs whose matrix has eigenvalues A of very 
different magnitudes, having the effect that, just as in Sec. 21.1, the step in direct methods, 
RK for example, cannot be increased beyond a certain threshold without losing stability. 
(A = —1 and —10 in Example 4, but larger differences do occur in applications.) 


Backward Euler Method for Systems of ODEs. Stiff Systems 


Compare the backward Euler method (8) with the Euler and the RK methods for numerically solving the initial 
value problem 


y” + 1ly’ + 10y = 10x + 11, y(0) =2, yO) = —-10 


converted to a system of first-order ODEs. 


Solution. The given problem can easily be solved, obtaining 


y=e + e lO + x 


so that we can compute errors. Conversion to a system by setting y = y1, y’ = ye [see (4)] gives 
yi = Ye yO) = 2 
yg 10y, — Illy. + 10x + 11 yo(0) = —10. 


The coefficient matrix 


0 1 —A 1 
A= 


-10 —-11 -10 -A-1I11 


| has the characteristic determinant | 


whose value is A2 + 11A + 10 (A + 1)(A + 10). Hence the eigenvalues are —1 and —10 as claimed above. 
The backward Euler formula is 
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Yn+1 = | 


Yin+1 Yin 
Y2n+1 Y2.n 


Y2n+1 


—10yin+41 — Uyen+1 + 10%n41 + 11 


Reordering terms gives the linear system in the unknowns yj, +1 and yon+1 


The coefficient determinant is D = 1 4 


Yn+1 = 


Yiyn+1 


s 
D 


(+ 1h)yin 


10hyyn+1 + + 11A)yansi 


hyan+1 = Vin 


Yan + 1OA(%n + h) 


11h? 4 


10h? 


+ yon + 10h? x, 


11h + 10h? 


10hy1.n 


+ Yon+ 10hxy 


11h. 
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+ 11h + 10h, and Cramer’s rule (in Sec. 7.6) gives the solution 


Table 21.13 Backward Euler Method (BEM) for Example 4. Comparison with Euler and RK 


BEM BEM Euler Euler RK RK Heat 
5 h = 0.2 h=0.4 h=0.1 h = 0.2 h = 0.2 h = 0.3 
0.0 2.00000 2.00000 2.00000 2.00000 2.00000 2.00000 2.00000 
0.2 1.36667 1.01000 0.00000 1.35207 1.15407 
0.4 1.20556 1.31429 1.56100 2.04000 1.18144 1.08864 
0.6 1.21574 1.13144 0.11200 1.18585 3.03947 1.15129 
0.8 1.29460 1.35020 1.23047 2.20960 1.26168 1.24966 
1.0 1.40599 1.34868 0.32768 1.37200 1.36792 
1:2 1.53627 1.57243 1.48243 2.46214 1.50257 5.07569 1.50120 
1.4 1.67954 1.62877 0.60972 1.64706 1.64660 
1.6 1.83272 1.86191 1.78530 2.76777 1.80205 1.80190 
1.8 1.99386 1.95009 0.93422 1.96535 8.72329 1.96530 
2.0 2.16152 2.18625 2.12158 3.10737 2.13536 2.13534 


Table 21.13 shows the following. 


Stability of the backward Euler method for h = 0.2 and 0.4 (and in fact for any h; try h = 5.0) with decreasing 
accuracy for increasing h 


Stability of the Euler method for h = 0.1 but instability for h = 0.2 
Stability of RK for h = 0.2 but instability for h = 0.3 


Figure 452 shows the Euler method for h = 0.18, an interesting case with initial jumping (for about x > 3) but 
later monotone following the solution curve of y = y,. See also CAS Experiment 15. B 


3.0 


2.0 


1.0 


) 


Fig. 452. Euler method with h = 0.18 in Example 4 
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PROBLEM SET 21-3 


1-6 


EULER FOR SYSTEMS AND 


SECOND-ORDER ODEs 


Solve by the Euler’s method. Graph the solution in the 
y1y2-plane. Calculate the errors. 


aI 


. Spiral. yy = 


y1 = 2y1 — 4y2, y2 = Y1 — 3ya, y1(0) = 3, 
yo(0) = 0, h=0.1, 10 steps 


yi + yas Y2 = “V1 — Yes 
yo(0) = 4, h=0.2, 5 steps 


y1(0) = 0, 


3.y" +ky=0, yO)=1, yO) =0, h=02, 
5 steps 

4. yi = —3y, t+ ya, yo =y1 — 3ya, yi(0) = 2, 
yo(0) = 0, h=0.1, 5 steps 

5.y"-y=x, yO=1, yO) =-2, h=0.1, 
5 steps 

6. yi =Y1, Y2= Ye, yiO)=2, yo(0) = 2, 
h=0.1, 10 steps 

7-10| RK FOR SYSTEMS 


Solve by the classical RK. 


7. 


12. 


. Pendulum 


The ODE in Prob. 5. By what factor did the error 
decrease? 


. The system in Prob. 2 
. The system in Prob. 1 
. The system in Prob. 4 


equation y”" + siny=0, y(7) =0, 
y'(7) = 1, asasystem, h=0.2, 20 steps. How 
does your result fit into Fig. 93 in Sec. 4.5? 

Bessel Function Jo. xy” + y’ +xy =0, yl) = 
0.765198, y'(1) = —0.440051, h=0.5, 5 steps. 
(This gives the standard solution Jg(x) in Fig. 110 in 
Sec. 5.4.) 


13. 


14. 


15. 


Verify the formulas and calculations for the Airy 
equation in Example 2 of the text. 


RKN. The classical RK for a first-order ODE extends 
to second-order ODEs (E. J. Nystrém, Acta fenn. 
No 13, 1925). If the ODE is y” = f(x,y), not 
containing y’, then 


ky = 3hf&n Yn) 

uJ re! 1 ry 1 = 
ka = ghflen + 3h, yn + gh(Yn + 2k1)) = kg 
ka = ghf(tn + hiyn + hyn + ke) 
h(yn + 3(k1 + 2k) 
(ky + 4ky + ka). 


Yn+1 = Yn 


’ ee; 
Yn+1 ~ Yn 


Apply this RKN (Runge—Kutta—Nystrom) method to 
the Airy ODE in Example 2 with h = 0.2 as before, to 
obtain approximate values of Ai(x). 

CAS EXPERIMENT. Backward Euler 
Stiffness. Extend Example 3 as follows. 

(a) Verify the values in Table 21.13 and show them 
graphically as in Fig. 452. 


and 


(b) Compute and graph Euler values for h near the 
“critical” h = 0.18 to determine more exactly when 
instability starts. 


(c) Compute and graph RK values for values of h 
between 0.2 and 0.3 to find A for which the RK 
approximation begins to increase away from the exact 
solution. 


(d) Compute and graph backward Euler values for 


large h; confirm stability and investigate the error 
increase for growing h. 


21.4 Methods for Elliptic PDEs 


We have arrived at the second half of this chapter, which is devoted to numerics for 
partial differential equations (PDEs). As we have seen in Chap.12, there are many 
applications to PDEs, such as in dynamics, elasticity, heat transfer, electromagnetic 
theory, quantum mechanics, and others. Selected because of their importance in 
applications, the PDEs covered here include the Laplace equation, the Poisson equation, 
the heat equation, and the wave equation. By covering these equations based on their 
importance in applications we also selected equations that are important for theoretical 
considerations. Indeed, these equations serve as models for elliptic, parabolic, and 
hyperbolic PDEs. For example, the Laplace equation is a representative example of an 


elliptic type of PDE, and so forth. 
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Recall, from Sec. 12.4, that a PDE is called quasilinear if it is linear in the highest 
derivatives. Hence a second-order quasilinear PDE in two independent variables x, y is of the 
form 


(1) Myx + 2bugy + Cuyy = F(X, y, Uy Ug, Uy). 


u is an unknown function of x and y (a solution sought). F is a given function of the 
indicated variables. 
Depending on the discriminant ac — b?, the PDE (1) is said to be of 


elliptic type if ac—b?>0 (example: Laplace equation) 
parabolic type if ac — b2=0 (example: heat equation) 


hyperbolic type if ac —b? <0 (example: wave equation). 


Here, in the heat and wave equations, y is time f. The coefficients a, b, c may be functions 
of x, y, so that the type of (1) may be different in different regions of the xy-plane. This 
classification is not merely a formal matter but is of great practical importance because 
the general behavior of solutions differs from type to type and so do the additional 
conditions (boundary and initial conditions) that must be taken into account. 

Applications involving elliptic equations usually lead to boundary value problems in a 
region R, called a first boundary value problem or Dirichlet problem if u is prescribed 
on the boundary curve C of R, a second boundary value problem or Neumann problem 
if uj, = du/dn (normal derivative of u) is prescribed on C, and a third or mixed problem 
if u is prescribed on a part of C and u,, on the remaining part. C usually is a closed curve 
(or sometimes consists of two or more such curves). 


Difference Equations 
for the Laplace and Poisson Equations 


In this section we develop numeric methods for the two most important elliptic PDEs that 
appear in applications. The two PDEs are the Laplace equation 


(2) Vo = eh Uy = 0 


and the Poisson equation 


(3) V7 = tre + ty = FO, 9). 


The starting point for developing our numeric methods is the idea that we can replace 
the partial derivatives of these PDEs by corresponding difference quotients. Details are 
as follows: 

To develop this idea, we start with the Taylor formula and obtain 


(a) u(x + hyy) = ula, y) + hugs, y) + SAUX, y) + EMPugen(x y) to 


(4) 
(b) u(x — hy y) = u(x, y) — hug (x, y) + GA ual, Y) — GIP uel y) + 00+ 
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We subtract (4b) from (4a), neglect terms in h? hi, --+, and solve for u,,. Then 
1 

(5a) U(X, y) = Th [u(x + h,y) — u(x — h, y)]. 

Similarly, 


u(x, y + k) = u(x, y) + kuy(x, y) + Tk Uy y(X, y) +e: 


and 

u(x, y — k) = u(x, y) — kuy(x, y) + ak Uy y(X, Vy) ale. ssh 
By subtracting, neglecting terms in k3 4, +++, and solving for uy we obtain 
(5b) U(X, y) * x [u(x, y + k) — u(x, y — k)]. 


We now turn to second derivatives. Adding (4a) and (4b) and neglecting terms in 
ht h?,--+, we obtain u(x + h, y) + u(x — h, y) ~ 2u(x, y) + A?uze(x, y). Solving for uzy 
we have 


1 

(6a) Ura(X, Y) * fe [u(x + h, y) — 2u(x, y) + u(x — h, y)). 
h 

Similarly, 
1 

(6b) Uyy(X, Y) © 2 [u(x, y + k) — 2u(x, y) + u(x, y — k)]. 


We shall not need (see Prob. 1) 


1 
(6c) Uxy(x, Y) = —_ [u(x +h,y +k) —- ux-h,y +k) 
4hk 
—uxt+hy—k+ua-h,y — bk). 
Figure 453a shows the points (x + h, y), (x — h, y),-++ in (5) and (6). 


We now substitute (6a) and (6b) into the Poisson equation (3), choosing k = h to obtain 
a simple formula: 


(7) HGS Se 139) ae C59 ae 1D) a Ge — 1) ar es Wy — 18) — Gs 9) h°f(x, y). 


This is a difference equation corresponding to (3). Hence for the Laplace equation (2) 
the corresponding difference equation is 


(8) u(x + h, y) + u(x, y + h) + u(x — h, y) + u(x, y — h) — 4uQ, y) = 0. 


his called the mesh size. Equation (8) relates u at (x, y) to uv at the four neighboring points 
shown in Fig. 453b. It has a remarkable interpretation: u at (x, y) equals the mean of the 
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values of u at the four neighboring points. This is an analog of the mean value property 
of harmonic functions (Sec. 18.6). 

Those neighbors are often called FE (East), N (North), W (West), S (South). Then Fig. 453b 
becomes Fig. 453c and (7) is 


(7*) u(E) + u(N) + u(W) + u(S) — 4u(x, y) = hf (x, y). 
(x,y +h) N 
(x, y +k) x x 
x 
h h 


x (xt+h,y) (x-h,y) & 


h h 
x 
(x, y-k) x x 
(x, yh) Ss 
(a) Points in (5) and (6) (b) Points in (7) and (8) (c) Notation in (7*) 


Fig. 453. Points and notation in (5)—(8) and (7*) 


Our approximation of h’V7u in (7) and (8) is a 5-point approximation with the 
coefficient scheme or stencil (also called pattern, molecule, or star) 


1 1 
(9) 1 —4 17. We may now write (7) as 1 —-4 lpu= hf (x, y). 
1 1 


Dirichlet Problem 


In numerics for the Dirichlet problem in a region R we choose an / and introduce a square 
grid of horizontal and vertical straight lines of distance h. Their intersections are called 
mesh points (or lattice points or nodes). See Fig. 454. 

Then we approximate the given PDE by a difference equation [(8) for the Laplace 
equation], which relates the unknown values of u at the mesh points in R to each other 
and to the given boundary values (details in Example 1). This gives a linear system of 
algebraic equations. By solving it we get approximations of the unknown values of u at 
the mesh points in R. 

We shall see that the number of equations equals the number of unknowns. Now comes 
an important point. If the number of internal mesh points, call it p, is small, say, p < 100, 
then a direct solution method may be applied to that linear system of p < 100 equations 
in p unknowns. However, if p is large, a storage problem will arise. Now since each 
unknown u is related to only 4 of its neighbors, the coefficient matrix of the system is a 
sparse matrix, that is, a matrix with relatively few nonzero entries (for instance, 500 of 
10,000 when p = 100). Hence for large p we may avoid storage difficulties by using an 
iteration method, notably the Gauss—Seidel method (Sec. 20.3), which in PDEs is also 
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called Liebmann’s method (note the strict diagonal dominance). Remember that in this 
method we have the storage convenience that we can overwrite any solution component 
(value of u) as soon as a “new” value is available. 

Both cases, large p and small p, are of interest to the engineer, large p if a fine grid is 
used to achieve high accuracy, and small p if the boundary values are known only rather 
inaccurately, so that a coarse grid will do it because in this case it would be meaningless 
to try for great accuracy in the interior of the region R. 

We illustrate this approach with an example, keeping the number of equations small, 
for simplicity. As convenient notations for mesh points and corresponding values of the 
solution (and of approximate solutions) we use (see also Fig. 454) 


(10) Py = (th, jh), uy = uih, jh). 


0 5h x 


Fig. 454. Region in the xy-plane covered by a grid of mesh h, 
also showing mesh points P;, = (h, h),---, P, = (ih, jh),--- 


With this notation we can write (8) for any mesh point P; in the form 
(11) PES ae Waal tm Cine) ae geen oe oli > OE 


Remark. Our current discussion and the example that follows illustrate what we may 
call the reuseability of mathematical ideas and methods. Recall that we applied the 
Gauss-Seidel method to a system of ODEs in Sec. 20.3 and that we can now apply it 
again to elliptic PDEs. This shows that engineering mathematics has a structure and 
important mathematical ideas and methods will appear again and again in different 
situations. The student should find this attractive in that previous knowledge can be 
reapplied. 


Laplace Equation. Liebmann’s Method 


The four sides of a square plate of side 12 cm, made of homogeneous material, are kept at constant temperature 
0°C and 100°C as shown in Fig. 455a. Using a (very wide) grid of mesh 4 cm and applying Liebmann’s method 
(that is, Gauss-Seidel iteration), find the (steady-state) temperature at the mesh points. 


Solution. 1n the case of independence of time, the heat equation (see Sec. 10.8) 
“32 
Ut = C (Ugg + Uyy) 


reduces to the Laplace equation. Hence our problem is a Dirichlet problem for the latter. We choose the grid 
shown in Fig. 455b and consider the mesh points in the order P11, Po1, Pig, Pog. We use (11) and, in each equation, 
take to the right all the terms resulting from the given boundary values. Then we obtain the system 
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—4uy, + ua + Uy = —200 
et 2 4us1 T Uo = —200 
(12) 
Uy a 4uy2 Tr Ugg = —100 
Ug1 + Uy — 4u29 = —100. 


In practice, one would solve such a small system by the Gauss elimination, finding uy, = ug, = 87.5, 
ujg = Ua = 62.5. 

More exact values (exact to 3S) of the solution of the actual problem [as opposed to its model (12)] are 88.1 
and 61.9, respectively. (These were obtained by using Fourier series.) Hence the error is about 1%, which is 
surprisingly accurate for a grid of such a large mesh size h. If the system of equations were large, one would 
solve it by an indirect method, such as Liebmann’s method. For (12) this is as follows. We write (12) in the 
form (divide by —4 and take terms to the right) 


uy, = 0.25u91 + 0.25uy42 + 50 
Ugy = 0.2541 0.25ug2 + 50 
Uyg = 0.2541 0.25ug9 + 25 
uog = 0.25u91 + 0.25u42 29s 


These equations are now used for the Gauss-Seidel iteration. They are identical with (2) in Sec. 20.3, where 
Uyy = X41, Ug1 = XQ, Uy2 = X3, Ug = Xa, and the iteration is explained there, with 100, 100, 100, 100 chosen as 
starting values. Some work can be saved by better starting values, usually by taking the average of the boundary 
values that enter into the linear system. The exact solution of the system is u4, = Ug, = 87.5, Uy2 = Ugg = 62.5, 
as you may verify. 


y u=0 u=0 
12 
u = 100 
u = 100 u = 100 
) 12 « 
u = 100 u = 100 
(a) Given problem (b) Grid and mesh points 


Fig. 455. Example 1 


Remark. It is interesting to note that, if we choose mesh h = L/n (L = side of R) and consider the (n — 12 
internal mesh points (i.e., mesh points not on the boundary) row by row in the order 


Py, Por, +++, Pr—1,1, Pro, Poo, +++, Pa-22.°°°> 


then the system of equations has the (n — 1)? X (n — 1)? coefficient matrix 


3) A= : ‘ Here B= 
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is an (n — 1) X (n — 1) matrix. (In (12) we have n = 3, (n — 1)? = 4 internal mesh points, two submatrices 
B, and two submatrices I.) The matrix A is nonsingular. This follows by noting that the off-diagonal entries in 
each row of A have the sum 3 (or 2), whereas each diagonal entry of A equals —4, so that nonsingularity is 
implied by Gerschgorin’s theorem in Sec. 20.7 because no Gerschgorin disk can include 0. 


A matrix is called a band matrix if it has all its nonzero entries on the main diagonal 
and on sloping lines parallel to it (separated by sloping lines of zeros or not). For example, 
A in (13) is a band matrix. Although the Gauss elimination does not preserve zeros between 
bands, it does not introduce nonzero entries outside the limits defined by the original 
bands. Hence a band structure is advantageous. In (13) it has been achieved by carefully 
ordering the mesh points. 


ADI Method 


A matrix is called a tridiagonal matrix if it has all its nonzero entries on the main 
diagonal and on the two sloping parallels immediately above or below the diagonal. (See 
also Sec. 20.9.) In this case the Gauss elimination is particularly simple. 

This raises the question of whether, in the solution of the Dirichlet problem for the 
Laplace or Poisson equations, one could obtain a system of equations whose coefficient 
matrix is tridiagonal. The answer is yes, and a popular method of that kind, called the 
ADI method (alternating direction implicit method) was developed by Peaceman and 
Rachford. The idea is as follows. The stencil in (9) shows that we could obtain a tridiagonal 
matrix if there were only the three points in a row (or only the three points in a column). 
This suggests that we write (11) in the form 


(14a) Uj—1,5 — Aug + Ui41,j = Ui, j—-1 — Wi,j41 


so that the left side belongs to y-Row j only and the right side to x-Column 7. Of course, 
we can also write (11) in the form 


(14b) Ui,j-1 — 4uy + Ui j41 = —Ui-1,j — Wi41,j 


so that the left side belongs to Column i and the right side to Row j. In the ADI method 
we proceed by iteration. At every mesh point we choose an arbitrary starting value us? 
In each step we compute new values at all mesh points. In one step we use an iteration 
formula resulting from (14a) and in the next step an iteration formula resulting from (14b), 
and so on in alternating order. 


In detail: suppose approximations ug” have been computed. Then, to obtain the next 


approximations une), we substitute the ug” on the right side of (14a) and solve for the 
Te on the left side; that is, we use 

(m+1) (m+1) Gn) — (m) (C79) 
(15a) Weeig = 4 Pg = =i Wy et 


We use (15a) for a fixed j, that is, for a fixed row j, and for all internal mesh points in 
this row. This gives a linear system of N algebraic equations (V = number of internal 
mesh points per row) in VN unknowns, the new approximations of u at these mesh points. 
Note that (15a) involves not only approximations computed in the previous step but also 
given boundary values. We solve the system (15a) (j fixed!) by Gauss elimination. Then 
we go to the next row, obtain another system of N equations and solve it by Gauss, and 
so on, until all rows are done. In the next step we alternate direction, that is, we compute 
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EXAMPLE 2 


the next approximations vidi column by column from the ao and the given boundary 


values, using a formula obtained from (14b) by substituting the i on the right: 


(m+2) (m+2) Cm+2) _ (m+1) (m+1) 
(15b) U4 5-1 = 4uj; ata U4 541 = ~Uy-1,5 = U4+1,9 5 


For each fixed i, that is, for each column, this is a system of M equations (M = number 
of internal mesh points per column) in M unknowns, which we solve by Gauss elimination. 
Then we go to the next column, and so on, until all columns are done. 

Let us consider an example that merely serves to explain the entire method. 


Dirichlet Problem. ADI Method 


Explain the procedure and formulas of the ADI method in terms of the problem in Example 1, using the same 
grid and starting values 100, 100, 100, 100. 

Solution. While working, we keep an eye on Fig. 455b and the given boundary values. We obtain first 
approximations uG, uS2, ud, uSB from (15a) with m = 0. We write boundary values contained in (15a) without 
an upper index, for better identification and to indicate that these given values remain the same during the 
iteration. From (15a) with m = 0 we have for j = | (first row) the system 


(= 1) uo. - Au? + uSy = —uy49 — u 
(i = 2) uy? 4u$) + U3, 29 ud. 


The solution is upp us? = 100. For j = 2 (second row) we obtain from (15a) the system 


G=1) upg — 4uSh + uBB = —u9 — u43 
(i = 2) uy 4uSB + U39 us? U93. 


The solution is uy = uss = 66.667. 


Second approximations u?, u&, wu, uB are now obtained from (15b) with m= 1 by using the first 


approximations just computed and the boundary values. For i = | (first column) we obtain from (15b) the system 
5 2) (2) = a 
G=1) uy — 4uP + vB = —uo1 — us? 
(j = 2) uf — 4u + ur3 Ugg — usy. 


The solution is uf? = 91.11, uf? = 64.44, For i = 2 (second column) we obtain from (15b) the system 


_ @) eo) __ a 
(i= 1) u20 — 4ugy + uss = —ujy — Us31 


(j = 2) uS) Aus + U93 up U39. 


The solution is u¥ = 91.11, uw = 64.44. 

In this example, which merely serves to explain the practical procedure in the ADI method, the accuracy of 
the second approximations is about the same as that of two Gauss-Seidel steps in Sec. 20.3 (where 
Uyy = X41, Ug1 = X2Q, Uyg = X3, Ugg = Xq), as the following table shows. 


Method U4 Up Uy Ugg 
ADI, 2nd approximations 91.11 91.11 64.44 64.44 
Gauss-Seidel, 2nd approximations 93.75 90.62 65.62 64.06 
Exact solution of (12) 87.50 87.50 62.50 62.50 a 
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Improving Convergence. Additional improvement of the convergence of the ADI 
method results from the following interesting idea. Introducing a parameter p, we can also 
write (11) in the form 


a6 (a) uj1j — 2 + plug + Ujsa,j = —Uigj-1 + 2 — plug — Uigjad 
(b) Wij — (2 + pug + UG ju = —Ui-,g + 2 — Plug — Wisr,;- 
This gives the more general ADI iteration formulas 
1 1 1 
ag @ wig — 2 + pug? + ust? = ula + 2 — pug? — ups 
(b) unt? — (2 zis pug’? 4 a = —— ugtp ae (2 - pugreed — eee 


For p = 2, this is (15). The parameter p may be used for improving convergence. Indeed, 
one can show that the ADI method converges for positive p, and that the optimum value 
for maximum rate of convergence is 


. 7 
(18) Po = 2 sin xk 


where K is the larger of M + 1 and N + 1 (see above). Even better results can be achieved 
by letting p vary from step to step. More details of the ADI method and variants are 
discussed in Ref. [E25] listed in App. 1. 


PROBLEM SET 21-4 


1. Derive (5b), (6b), and (6c). 


. Verify the calculations in Example | of the text. Find 


out experimentally how many steps you need to obtain 
the solution of the linear system with an accuracy of 3S. 


. Use of symmetry. Conclude from the boundary values 


in Example 1 that wo; = uy, and ugg = uy. Show 
that this leads to a system of two equations and solve it. 


. Finer grid of < x 3 inner points. pie Example 1, 


choosing h = 4 = 3 (instead of h = 
same starting lies, 


= 4) and the 


5-10 


GAUSS ELIMINATION, GAUSS-—SEIDEL 


ITERATION 


Fig. 456. Problems 5-10 


10. 


. Up on the upper and lower edges, 


For the grid in Fig. 456 compute the potential at the 
four internal points by Gauss and by 5 Gauss-Seidel 
steps with starting values 100, 100, 100, 100 (showing 
the details of your work) if the boundary values on the 
edges are: 


. u(1, 0) = 60, u(2, 0) = 300, u = 100 on the other 


three edges. 


. uw = 0 on the left, x? on the lower edge, 27 — Oy on 


the right, x? — 27x on the upper edge. 


—Up on the left and 
right. Sketch the equipotential lines. 


. u = 220 on the upper and lower edges, 110 on the left 


and right. 


.u=sin 4 aTKx on the upper edge, 0 on the other edges, 


10 steps. 


u = x* on the lower edge, 81 — 54y a y* on the right, 

4 _ 54x? + 81 on the upper edge, y* on the left. 
Verify the exact solution x* — 6x?y? + y* and 
determine the error. 
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11. 


12. 


13. 


14. 


Find the potential in Fig. 457 using (a) the coarse 
grid, (b) the fine grid 5 < 3, and Gauss elimination. 
Hint. In (b), use symmetry; take u = O as boundary 
value at the two points at which the potential has a 
jump. 


u=110V 

w=110V P,, |u=110V 
Py 

u=-110V u=-110V 
u=-110V 


Fig. 457. Region and grids in Problem 11 


Influence of starting values. Do Prob. 9 by Gauss— 
Seidel, starting from 0. Compare and comment. 


For the square 0 = x = 4,0 Sy S 4 let the boundary 
temperatures be 0°C on the horizontal and 50°C on the 
vertical edges. Find the temperatures at the interior 
points of a square grid with h = 1. 


Using the answer to Prob. 13, try to sketch some 
isotherms. 


15. 


16. 


17. 


18. 
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Find the isotherms for the square and grid in Prob. 13 
if u = sin 4x on the horizontal and —sin 4 Ty on the 
vertical edges. Try to sketch some isotherms. 


ADI. Apply the ADI method to the Dirichlet problem 
in Prob. 9, using the grid in Fig. 456, as before and 
starting values zero. 


What po in (18) should we choose for Prob. 16? Apply 
the ADI formulas (17) with that value of pg to Prob. 16, 
performing 1 step. Illustrate the improved convergence 
by comparing with the corresponding values 0.077, 
0.308 after the first step in Prob. 16. (Use the starting 
values zero.) 


CAS PROJECT. Laplace Equation. (a) Write a 
program for Gauss-Seidel with 16 equations in 16 
unknowns, composing the matrix (13) from the indicated 
4 X 4 submatrices and including a transformation of 
the vector of the boundary values into the vector b of 
Ax = b. 


(b) Apply the program to the square grid in0 = x $5, 
0 = y $5 with h = 1 and u = 220 on the upper and 
lower edges, u = 110 on the left edge and u = —10 
on the right edge. Solve the linear system also by Gauss 
elimination. What accuracy is reached in the 20th 
Gauss-Seidel step? 


21.5 Neumann and Mixed Problems. 


Irregular Boundary 


We continue our discussion of boundary value problems for elliptic PDEs in a region R 
in the xy-plane. The Dirichlet problem was studied in the last section. In solving Neumann 
and mixed problems (defined in the last section) we are confronted with a new situation, 
because there are boundary points at which the (outer) normal derivative u,, = du/dn of 
the solution is given, but u itself is unknown since it is not given. To handle such points 
we need a new idea. This idea is the same for Neumann and mixed problems. Hence we 
may explain it in connection with one of these two types of problems. We shall do so and 
consider a typical example as follows. 


EXAMPLE 1 


Mixed Boundary Value Problem for a Poisson Equation 


Solve the mixed boundary value problem for the Poisson equation 


Vu 


Urge 1 Uyy 


f(y) = 12xy 
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shown in Fig. 458a. 


u=3y° 


0) y. 15: 


u=0 
(a) Region R and boundary values (b) Grid (fh = 0.5) 


Fig. 458. Mixed boundary value problem in Example 1 


Solution. We use the grid shown in Fig. 458b, where h = 0.5. We recall that (7) in Sec. 21.4 has the right 
side h?f(x, y) = 0.5% + 12xy = 3xy. From the formulas u = 3y and uw, = 6x given on the boundary we compute 
the boundary data 


duy2 Ouy2 6-0.5 3 du292 Ouse 
an dy : : an dy 


(1) U3) = 0.375, u32 3; 6-1 6. 


Py, and Py, are internal mesh points and can be handled as in the last section. Indeed, from (7), Sec. 21.4, with 
h? = 0.25 and hf (x, y) = 3xy and from the given boundary values we obtain two equations corresponding to 
Py, and P51, as follows (with —O resulting from the left boundary). 


12(0.5 + 0.5)+4 —0=0.75 


—4uy, + uo, + uy2 
(2a) 


Uy, — 4u91 + ugg = 12(1 + 0.5) +4 — 0.375 = 1.125. 
The only difficulty with these equations seems to be that they involve the unknown values wy and uge of u at 
Pig and Pog on the boundary, where the normal derivative u,, = du/dn = du/dy is given, instead of u; but we 
shall overcome this difficulty as follows. 

We consider Pyg and Po2. The idea that will help us here is this. We imagine the region R to be extended 
above to the first row of external mesh points (corresponding to y = 1.5), and we assume that the Poisson 
equation also holds in the extended region. Then we can write down two more equations as before (Fig. 458b) 


Uy — 4Auyo + Ugg + 443 =15-0=15 
(2b) 


Ug, + Uy — 4Ug2 + uo3 = 3-3 = 0. 


On the right, 1.5 is 12xyh? at (0.5, 1) and 3 is 12xyh? at (1, 1) and 0 (at Pog) and 3 (at P32) are given boundary 
values. We remember that we have not yet used the boundary condition on the upper part of the boundary of 
R, and we also notice that in (2b) we have introduced two more unknowns 43, U3. But we can now use that 
condition and get rid of 3, u23 by applying the central difference formula for du/dy. From (1) we then obtain 
(see Fig. 458b) 


duj2° M3 — 44 


= 43 — W441, hence yg = Uy, + 3 
dy 2h 
du22 ug3 — Ugy 
— = U23 — U4, hence Uo3 = Uo, + 6. 
dy 2h 


Substituting these results into (2b) and simplifying, we have 


2u41 Auyo t ug2 1 3 1S 


2ug1 + Uy — 4uag9 =3 -—3-6 6. 
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Together with (2a) this yields, written in matrix form, 


-4 1 1 O]fuy 0.75 0.75 
1-4 0 1] J uy 1.125 1.125 
(3) = = 
2 0 -4 1] uy 15-3 -1.5 
0 2 i —4 ug92 0=—6 —6 


(The entries 2 come from u 13 and ugg, and so do —3 and —6 on the right). The solution of (3) (obtained by 
Gauss elimination) is as follows; the exact values of the problem are given in parentheses. 


Uy = 0.866 (exact 1) Ugg = 1.812 (exact 2) 


Uy, = 0.077 (exact 0.125) Ug, = 0.191 (exact 0.25). @ 


Irregular Boundary 


We continue our discussion of boundary value problems for elliptic PDEs in a region R 
in the xy-plane. If R has a simple geometric shape, we can usually arrange for certain 
mesh points to lie on the boundary C of R, and then we can approximate partial derivatives 
as explained in the last section. However, if C intersects the grid at points that are not 
mesh points, then at points close to the boundary we must proceed differently, as follows. 

The mesh point O in Fig. 459 is of that kind. For O and its neighbors A and P we obtain 
from Taylor’s theorem 


duo 1 2 d7u9 
(a) Ua =UQ+t+ah + —(ah) yo Ps 
ox 2 ox 
(4) : : 
duo Ce) uo 
b = h +h 
(OD) Be Bo Ox 2 ax? 


We disregard the terms marked by dots and eliminate dug/dx. Equation (4b) times a plus 
equation (4a) gives 


2 duo 
ax? * 


1 
ee aes a es + l)h 


Fig. 459. Curved boundary C of a region R, a mesh point O near C, 
and neighbors A, B, P, Q 


We solve this last equation algebraically for the derivative, obtaining 


ug: 2 1 iol 1 
tad u Uu 
a Rlat+a * i+e 7? a 


uo 
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Similarly, by considering the points O, B, and Q, 


dup _ 2 1 ae! 1 
= u ——tlo = Su 
ay? lb +b) 2 > 1+b 2% 5b? 


By addition, 


2u0 ~ + + 
Gy -Yvo alta) bd+b) 1l+a 1+b ab 


2, UA uBR Up (a + | 
h : 


For example, if a = 5, b= 5, instead of the stencil (see Sec. 21.4) 


I 3 
1 -4 1 we now have 5 —4 3 
I 3 


because 1/[a(1 + a)] = 3 etc. The sum of all five terms still being zero (which is useful 
for checking). 
Using the same ideas, you may show that in the case of Fig. 460. 


2 u Uu Uu ug ap+b 
(6) | Vue == ney eet ee ee |e 
h°|aat+p) bb+q ppta) qqt+b) abpq 


a formula that takes care of all conceivable cases. 


Q 


Fig. 460. Neighboring points A, B, P, Q of a 
mesh point O and notations in formula (6) 


Dirichlet Problem for the Laplace Equation. Curved Boundary 


Find the potential wu in the region in Fig. 461 that has the boundary values given in that figure; here the curved 
portion of the boundary is an arc of the circle of radius 10 about (0,0). Use the grid in the figure. 


Solution. wis a solution of the Laplace equation. From the given formulas for the boundary values u = x, 
u = 512 — 24y?,--- we compute the values at the points where we need them; the result is shown in the figure. 
For Py, and Pyg we have the usual regular stencil, and for P2; and Poy we use (6), obtaining 


1 05 0.9 
(7) Puy, Pio: 1 =4 1?, Poy: 0.6 =2.5 0.9 a Po3: 0.6 =3 0.97. 
1 0.5 0.6 
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uw =x — 243x 


u = 4x° — 300x 
u = -936 


u =-352 


u = 512 —24y* 


uUu=xX 


Fig. 461. Region, boundary values of the potential, and grid in Example 2 


We use this and the boundary values and take the mesh points in the usual order P,y, P21, Pyz, Peo. Then we 
obtain the system 


—4u4, + uaa + Uy = 0-27 = —-27 
0.641 — 2.5u21 0.5u22 0.9 - 296 — 0.5 - 216 = —374.4 
Uy 4uys uo 702 + 0 = 702 


0.6u21 + 0.6uj2 — 3u92 = 0.9 + 352 + 0.9 - 936 = 1159.2 


In matrix form, 


—4 1 1 0 uy. =2) 
0.6 =2.5 0 0.5 | | ver —374.4 
(8) = 
1 0 —4 1 uy2 702 
0 0.6 0.6 = —3 ug9 1159.2 


Gauss elimination yields the (rounded) values 
uy = =59:6; ug, = 49.2, uy2 > —298.5, ug2 = —436.3. 


Clearly, from a grid with so few mesh points we cannot expect great accuracy. The exact solution of the PDE 
(not of the difference equation) having the given boundary values is u = x? — 3xy" and yields the values 


uyy = —54, ug, = 54, u4y2 = =297; ug2 = —432. 


In practice one would use a much finer grid and solve the resulting large system by an indirect method. MM 


PROBLEEM—SET 21-5 


MIXED BOUNDARY VALUE PROBLEMS 


1. Check the values for the Poisson equation at the end 
of Example 1| by solving (3) by Gauss elimination. 


2. Solve the mixed boundary value problem for the 
Poisson equation V2u = 2(x? + y?) in the region and 
for the boundary conditions shown in Fig. 462, using 
the indicated grid. 


Fig. 462. Problems 2 and 6 


93 


3. CAS EXPERIMENT. Mixed Problem. Do Example 
1 in the text with finer and finer grids of your choice 
and study the accuracy of the approximate values by 
comparing with the exact solution u = xy? Verify the 


4. Solve the mixed boundary value problem for the 
Laplace equation Vu = 0 in the rectangle in Fig. 458a 


6 


latter. 


CHAP. 21 Numerics for ODEs and PDEs 


(using the grid in Fig. 458b) and the boundary ) 5 
conditions u, = 0 on the left edge, uv, = 3 on the right e lA 2 : 


edge, u = x2 on the lower edge, and u = x 


the upper edge. 


u = 3x 


Fig. 463. Problem 13 


2 _ 1 on 


5. Do Example 1 in the text for the Laplace equation 14. If, in Prob. 13, the axes are grounded (w = 0), what 


(instead of the Poisson equation) with grid and constant potential must the other portion of the 
boundary data as before. boundary have in order to produce 220 V at Py? 
2 Dee Hl Bahan ane 
6. Solve V°u = —q"y sin 37x for the grid in Fig. 462 15, What potential do we have in Prob. 13 if u = 100 V 
= al = = 

and u,(1, 3) = u,(2, 3) = 3 V243, u = 0 on the other on the axes and u = 0 on the other portion of the 
three sides of the square. boundary? 

7. Solve Prob. 4 when wu, = 110 on the upper edge and 16. Solve the Poisson equation V7u = 2 in the region and 
u = 110 on the other edges. for the boundary values shown in Fig. 464, using the 

grid also shown in the figure. 
8-16; IRREGULAR BOUNDARY 
8. Verify the stencil shown after (5). 


. Derive (5) in the general case. 

. Derive the general formula (6) in detail. 

. Derive the linear system in Example 2 of the text. 
. Verify the solution in Example 2. 


. Solve the Laplace equation in the region and for the 
boundary values shown in Fig. 463, using the 
indicated grid. (The sloping portion of the boundary 


isy =4.5 — x.) 


Fig. 464. Problem 16 


21.6 Methods for Parabolic PDEs 


The last two sections concerned elliptic PDEs, and we now turn to parabolic PDEs. Recall 
that the definitions of elliptic, parabolic, and hyperbolic PDEs were given in Sec. 21.4. 
There it was also mentioned that the general behavior of solutions differs from type to 
type, and so do the problems of practical interest. This reflects on numerics as follows. 

For all three types, one replaces the PDE by a corresponding difference equation, but 
for parabolic and hyperbolic PDEs this does not automatically guarantee the convergence 
of the approximate solution to the exact solution as the mesh h— 0; in fact, it does not 
even guarantee convergence at all. For these two types of PDEs one needs additional 
conditions (inequalities) to assure convergence and stability, the latter meaning that small 
perturbations in the initial data (or small errors at any time) cause only small changes at 
later times. 

In this section we explain the numeric solution of the prototype of parabolic PDEs, the 
one-dimensional heat equation 


uy = C Une (c constant). 
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This PDE is usually considered for x in some fixed interval, say, 0 S x S L, and time 
t = 0, and one prescribes the initial temperature u(x, 0) = f(x) (f given) and boundary 
conditions at x = 0 and x = L for all ¢ = 0, for instance, u(0, t) = 0, u(L, t) = 0. We may 
assume c = | and L = 1; this can always be accomplished by a linear transformation of 
x and ¢t (Prob. 1). Then the heat equation and those conditions are 


(1) Ut = Ugy OZE8= 1720 
(2) u(x, 0) = f(x) (Initial condition) 
(3) u(O0, t) = u(1, t) = O (Boundary conditions). 


A simple finite difference approximation of (1) is [see (6a) in Sec. 21.4; 7 is the number 
of the time step] 


1 il 
(4)  Miitd = uy) = ye ith = Die se WA 


Figure 465 shows a corresponding grid and mesh points. The mesh size is / in the x-direction 
and k in the f-direction. Formula (4) involves the four points shown in Fig. 466. On the left 
in (4) we have used a forward difference quotient since we have no information for negative 
t at the start. From (4) we calculate u; ;.1, which corresponds to time row j + 1, in terms 
of the three other u that correspond to time row j. Solving (4) for u;;.1, we have 


(5) Wjij+1 = (1 — 2rjuy + ruisig + us-1,9)s f=—S: 


Computations by this explicit method based on (5) are simple. However, it can be shown 
that crucial to the convergence of this method is the condition 


k 
r=-9 


IIA 


Nile 


(6) 


= 


u = f(x) 


Fig. 465. Grid and mesh points corresponding to (4), (5) 


(i,j + 1) 
x 
k 
G-1,)) —— (i+ 1,j) 
Co) 
Fig. 466. The four points in (4) and (5) 
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That is, uj; should have a positive coefficient in (5) or (for r = x) be absent from (5). 
Intuitively, (6) means that we should not move too fast in the f-direction. An example is 
given below. 


Crank—Nicolson Method 


Condition (6) is a handicap in practice. Indeed, to attain sufficient accuracy, we have 
to choose small, which makes k very small by (6). For example, if h = 0.1, then 
k = 0.005. Accordingly, we should look for a more satisfactory discretization of the 
heat equation. 

A method that imposes no restriction on r = k/h? is the Crank—Nicolson (CN) 
method,” which uses values of u at the six points in Fig. 467. The idea of the method 
is the replacement of the difference quotient on the right side of (4) by 5 times the 
sum of two such difference quotients at two time rows (see Fig. 467). Instead of (4) 
we then have 


1 1 
yj Midtd — Ky) = ape ith — Quy = + uj4-1,;) 


(7) ; 
+ aye Mit hatd = Daj get P Wij 44): 


Multiplying by 2k and writing r = k/ h” as before, we collect the terms corresponding to 
time row j + | on the left and the terms corresponding to time row j on the right: 


CI (2 ar Dy ag) = Aa pak ae i pe = (CO = Minis ae dW 4 or Wa 


How do we use (8)? In general, the three values on the left are unknown, whereas the 
three values on the right are known. If we divide the x-interval 0 S x S 1 in (1) inton 
equal intervals, we have n — | internal mesh points per time row (see Fig. 465, where 


n = 4). Then for j = 0 andi = 1,---,n — 1, formula (8) gives a linear system of n — 1 
equations for the n — 1 unknown values uw}, W21,°** , 4—1,1 in the first time row in terms 
of the initial values ugo, Uv10,°°*,%no and the boundary values uo1(= 0), uni (= 0). 


Similarly for 7 = 1,7 = 2, and so on; that is, for each time row we have to solve such a 
linear system of n — 1 equations resulting from (8). 

Although r = k/ h” is no longer restricted, smaller r will still give better results. In 
practice, one chooses a k by which one can save a considerable amount of work, without 


°JOHN CRANK (1916-2006), English mathematician and physicist at Courtaulds Fundamental Research 
Laboratory, professor at Brunel University, England. Student of Sir WILLIAM LAWRENCE BRAGG 
(1890-1971), Australian British physicist, who with his father, Sir WILLIAM HENRY BRAGG (1862-1942) 
won the Nobel Prize in physics in 1915 for their fundamental work in X-ray crystallography. (This is the only 
case where a father and a son shared the Nobel Prize for the same research. Furthermore, W. L. Bragg is the 
youngest Nobel laureate ever.) PHYLLIS NICOLSON (1917-1968), English mathematician, professor at the 
University of Leeds, England. 
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EXAMPLE 1 


making r too large. For instance, often a good choice is r = | (which would be impossible 
in the previous method). Then (8) becomes simply 


(9) Auyj+1 — Mi41,j+1 — Mi-1,j4+1 = Wit1g + Ui-1,;- 
Time rowj+1  %————__ x ——____ x 
k 
Time rowj xXx—_ ——————<<e 
J i x i x 


Fig. 467. The six points in the Crank—Nicolson formulas (7) and (8) 


Fig. 468. Grid in Example 1 


Temperature in a Metal Bar. Crank—Nicolson Method, Explicit Method 


Consider a laterally insulated metal bar of length 1 and such that c” = 1 in the heat equation. Suppose that the 
ends of the bar are kept at temperature u = 0°C and the temperature in the bar at some instant—call it t = 0— 
is f(x) = sin 77x. Applying the Crank—Nicolson method with h = 0.2 and r = 1, find the temperature u(x, ft) in 
the bar for 0 = t = 0.2. Compare the results with the exact solution. Also apply (5) with an r satisfying (6), 
say, r = 0.25, and with values not satisfying (6), say, r = 1 and r = 2.5. 


Solution by Crank-Nicolson. Since r = 1, formula (8) takes the form (9). Since h = 0.2 and 
r=k/ h® = 1, we have k = h” = 0.04. Hence we have to do 5 steps. Figure 468 shows the grid. We shall need 
the initial values 


Uy = sin 0.277 = 0.587785, Ugo = sin 0.477 = 0.951057. 


Also, u39 = Ugo and ugg = Uy. (Recall that wj9 means u at Pyo in Fig. 468, etc.) In each time row in Fig. 
468 there are 4 internal mesh points. Hence in each time step we would have to solve 4 equations in 4 
unknowns. But since the initial temperature distribution is symmetric with respect to x = 0.5, and u = 0 at 
both ends for all t, we have v3) = ug1, U4, = Uy, in the first time row and similarly for the other rows. This 
reduces each system to 2 equations in 2 unknowns. By (9), since u3; = ug, and uo; = 0, for j = 0 these 
equations are 


(G = 1) 4uy4 — ar = Ugo + U29 = 0.951057 


(i = 2) Uy, + 4u91 — U2, = Uyo + Ugg = 1.538842. 


The solution is uw, = 0.399274, ua, = 0.646039. Similarly, for time row j = 1 we have the system 


(G = 1) 4uyo uo2 uol ugi 0.646039 


(G 2) uy2 t 3ug2 U4 ugi 1.045313. 
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The solution is wy = 0.271221, ua = 0.438844, and so on. This gives the temperature distribution 
(Fig. 469): 


t x=0 x = 0.2 x = 0.4 x = 0.6 x = 0.8 x=1 
0.00 0 0.588 0.951 0.951 0.588 0 
0.04 0 0.399 0.646 0.646 0.399 0 
0.08 0 0.271 0.439 0.439 0.271 0 
0.12 0 0.184 0.298 0.298 0.184 0 
0.16 0) 0.125 0.202 0.202 0.125 0 
0.20 0 0.085 0.138 0.138 0.085 0 

u(x, t) 
Ak 


0.5 


(0) 


0 0.5 lx 


Fig. 469. Temperature distribution in the bar in Example 1 


Comparison with the exact solution. The present problem can be solved exactly by separating 
variables (Sec. 12.5); the result is 


(10) u(x, t) = sin axe 7 *, 
Solution by the explicit method (5) with r = 0.25. For h = 0.2 and r = k/h” = 0.25 we have 


k = rh® = 0.25 + 0.04 = 0.01. Hence we have to perform 4 times as many steps as with the Crank—Nicolson 
method! Formula (5) with r = 0.25 is 


(11) U4 j4+1 = 0.25(uj—-1,; + Quy + Uj+1,j) 


We can again make use of the symmetry. For 7 = 0 we need woo = 0, u19 = 0.587785 (see p. 939), 
Ug9 = Ugq = 0.951057 and compute 


u4y > 0.25(ugo ae 2u10 + U0) = 0.531657 


Ug, = 0.25(uyo + 299 + U39) = 0.25(U49 + 329) = 0.860239. 
Of course we can omit the boundary terms ug, = 0, ugg = 0,--: from the formulas. For j = 1 we compute 


Uy2 = 0.25(2u41 + U21) = 0.480888 
ug2 = 0.25(u44 + 3u21) = 0.778094 


and so on. We have to perform 20 steps instead of the 5 CN steps, but the numeric values show that the accuracy 
is only about the same as that of the Crank—Nicolson values CN. The exact 3D-values follow from (10). 
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x = 0.2 x = 0.4 
if 

CN By (11) Exact CN By (11) Exact 
0.04 0.399 0.393 0.396 0.646 0.637 0.641 
0.08 0.271 0.263 0.267 0.439 0.426 0.432 
0.12 0.184 0.176 0.180 0.298 0.285 0.291 
0.16 0.125 0.118 0.121 0.202 0.191 0.196 
0.20 0.085 0.079 0.082 0.138 0.128 0.132 


Failure of (5) with r violating (6). Formula (5) with h = 0.2 and r = 1—which violates (6)—is 
U4 j+1 = U4—-1,j — Ug + Uj+1,5 


and gives very poor values; some of these are 


t x= 0.2 Exact x=04 Exact 
0.04 0.363 0.396 0.588 0.641 
0.12 0.139 0.180 0.225 0.291 
0.20 0.053 0.082 0.086 0.132 


Formula (5) with an even larger r = 2.5 (and h = 0.2 as before) gives completely nonsensical results; some of 


these are 


t x= 0.2 
0.1 0.0265 
0.3 0.0001 


Exact x= 0.4 Exact 
0.2191 0.0429 0.3545 
0.0304 0.0001 0.0492. i 


PROBLEM SET 21-6 


1. Nondimensional form. Show that the heat equation 
itz = c7tizz, 0 S ¥ S L, can be transformed to the 
“nondimensional” standard form u, = u,,,0 =x S 1, 
by setting x = X/L, t = CHL, u = i/up, Where ug is 
any constant temperature. 

2. Difference equation. Derive the difference approxi- 


mation (4) of the heat equation. 
3. Explicit method. Derive (5) by solving (4) for uj; 41. 
4. CAS EXPERIMENT. Comparison of Methods. 
(a) Write programs for the explicit and the Crank— 
Nicolson methods. 
(b) Apply the programs to the heat problem of a 
laterally insulated bar of length 1 with u(x, 0) = sin 77x 
and u(0,t) = u(1,t) = 0 for all 7, using h = 0.2, 
k = 0.01 for the explicit method (20 steps), h = 0.2 
and (9) for the Crank—Nicolson method (5 steps). 
Obtain exact 6D-values from a suitable series and 
compare. 
(c) Graph temperature curves in (b) in two figures 
similar to Fig. 299 in Sec. 12.7. 


(d) Experiment with smaller h (0.1, 0.05, etc.) for both 
methods to find out to what extent accuracy increases 
under systematic changes of h and k. 


EXPLICIT METHOD 


5. Using (5) with h = 1 and k = 0.5, solve the heat 
problem (1)—(3) to find the temperature at t = 2 ina 
laterally insulated bar of length 10 ft and initial 
temperature f(x) = x(1 — 0.1%). 

6. Solve the heat problem (1)—(3) by the explicit method 
with h = 0.2 andk = 0.01, 8 time steps, when f(x) = x 
if 0=x<4,f@ = x at 5 =x = 1. Compare 
with the 3S-values 0.108, 0.175 for t= 0.08, 
x = 0.2,0.4 obtained from the series (2 terms) in 
Sec. 12.5. 

7. The accuracy of the explicit method depends on 
re 3). Illustrate this for Prob. 6, choosing r = 3 (and 
h = 0.2 as before). Do 4 steps. Compare the values for 
t = 0.04 and 0.08 with the 3S-values in Prob. 6, which 
are 0.156, 0.254 (t = 0.04), 0.105, 0.170 (t = 0.08). 
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8. In a laterally insulated bar of length 1 let the initial 
temperature be f(x) = xifO Sx<05,f@) =1-x 
if 0.5 S x S 1. Let (1) and (3) hold. Apply the explicit 
method with h = 0.2, k = 0.01, 5 steps. Can you expect 
the solution to satisfy u(x, ft) = u(1 — x, f) for all r? 

9. Solve Prob. 8 with fZyy=x if OSxS0.2, 
f@) = 0.2501 — x) if 0.2<x 31, the other data 
being as before. 

10. Insulated end. If the left end of a laterally insulated 
bar extending from x = 0 to x = 1 is insulated, the 
boundary condition at x = Ois u,,(0, t) = u,(0, t) = 0. 
Show that, in the application of the explicit method 
given by (5), we can compute uo; +1 by the formula 


Ugj+a = CU — 2r)ugj + 2ruy4;. 


Apply this with h = 0.2 and r = 0.25 to determine the 
temperature u(x, f) in a laterally insulated bar extending 
from x = 0 to | if u(x, 0) = 0, the left end is insulated 
and the right end is kept at temperature g(t) = sin 98 Tt. 
Hint. Use 0 = duoj/x = (uyj — u—4j)/2h. 


CRANK-NICOLSON METHOD 


11. Solve Prob. 9 by (9) with h = 0.2, 2 steps. Compare 
with exact values obtained from the series in Sec. 12.5 
(2 terms) with suitable coefficients. 


12. Solve the heat problem (1)-(3) by Crank—Nicolson 
for 0 St $0.20 with h = 0.2 and k = 0.04 when 
f@M=x if OSx<kf@=1--x if 45x81. 
Compare with the exact values for t = 0.20 obtained 
from the series (2 terms) in Sec. 12.5. 


13-15 


Solve (1)-(3) by Crank—Nicolson with r = 1 (5 steps), 
where: 


13. f@%) =5x if OSx< 0.25, fQ) =125(11—x if 
0.25Sx=1,h=02 


14. f(x) = x11 — x), =A = 0.1. (Compare with Prob. 15.) 
15. f%) =x -— x), h=0.2 


2\.7 Method for Hyperbolic PDEs 


In this section we consider the numeric solution of problems involving hyperbolic PDEs. 
We explain a standard method in terms of a typical setting for the prototype of a hyperbolic 
PDE, the wave equation: 


(1) lie Tie 0Sx51,t20 

(2) u(x, 0) = f(x) (Given initial displacement) 
(3) uz(x, 0) = g(x) (Given initial velocity) 

(4) u(0, t) = u(1, t) = 0 (Boundary conditions). 


Note that an equation uz, = Clie: and another x-interval can be reduced to the form (1) 
by a linear transformation of x and ¢. This is similar to Sec. 21.6, Prob. 1. 

For instance, (1)—(4) is the model of a vibrating elastic string with fixed ends at x = 0 
and x = | (see Sec. 12.2). Although an analytic solution of the problem is given in (13), 
Sec. 12.4, we use the problem for explaining basic ideas of the numeric approach that are 
also relevant for more complicated hyperbolic PDEs. 

Replacing the derivatives by difference quotients as before, we obtain from (1) [see (6) 
in Sec. 21.4 with y = ¢] 


il 1 
(5) pe eat — Quy + ui,j-1) = pe ith — Quy + ui_1,)) 


where h is the mesh size in x, and k is the mesh size in ¢. This difference equation relates 
5 points as shown in Fig. 470a. It suggests a rectangular grid similar to the grids for 
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parabolic equations in the preceding section. We choose r* = k?/ h? = 1. Then uj; drops 
out and we have 


(6) ig pal = hata ar Cesc, — Wi sa (Fig. 470b). 


It can be shown that for 0 < r* S 1 the present explicit method is stable, so that from 
(6) we may expect reasonable results for initial data that have no discontinuities. (For a 
hyperbolic PDE the latter would propagate into the solution domain—a phenomenon that 
would be difficult to deal with on our present grid. For unconditionally stable implicit 
methods see [E1] in App. 1.) 


x Time rowj + 1 e 
E 
ar ao ae Ta. Time rowj x—,—-x 
|e 
x Time rowj — 1 x 
(a) Formula (5) (b) Formula (6) 


Fig. 470. Mesh points used in (5) and (6) 


Equation (6) still involves 3 time steps j — 1,j, 7 + 1, whereas the formulas in the 
parabolic case involved only 2 time steps. Furthermore, we now have 2 initial conditions. 
So we ask how we get started and how we can use the initial condition (3). This can be 
done as follows. 

From uz(x, 0) = g(x) we derive the difference formula 


il 
(7) ak (ui. — Ui,-1) = &: hence Uj,-1 = Uj, — 2kg; 


where g; = g(ih). For tf = 0, that is, 7 = 0, equation (6) is 
Uj = Uj—-1,0 + Ui+1,0 — U4,-1. 
Into this we substitute wu; 1 as given in (7). We obtain 431 = Uj_-1,9 + Uj+1,.0 — Uia + 2kg; 
and by simplification 
1 
(8) Uiy = 9(Ui-1,0 + Ui+1,0) + kei, 


This expresses u;, in terms of the initial data. It is for the beginning only. Then use (6). 


Vibrating String, Wave Equation 


Apply the present method with h = k = 0.2 to the problem (1)-(4), where 

f(x) = sin 7x, g(x) = 0. 
Solution. The grid is the same as in Fig. 468, Sec. 21.6, except for the values of t, which now are 0.2, 0.4, --- 
(instead of 0.04, 0.08,---). The initial values uo0, “49,°** are the same as in Example 1, Sec. 21.6. From (8) 


and g(x) = 0 we have 


_1 
uit = Z(Uj-1,0 F Ui+1,0)- 
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From this we compute, using “49 = Ugg = sin 0.27 = 0.587785, uso = Ugo = 0.951057, 


: 1 
(G@= 1) u11 = goo + Ua20) 


+ 0.951057 = 0.475528 


(i= 2) ua = 3(U10 + Ugo) 


1 
2 
x * 1.538842 = 0.769421 


and ug, = W21, U4, = Uy, by symmetry as in Sec. 21.6, Example 1. From (6) with j = 1 we now compute, 
using Ug, = Ugg = °°: = 0, 


(G = 1) uy42 uol ugy u40 0.769421 — 0.587785 = 0.181636 


Gi = 2) uo2 U4 UZ1 U0 0.475528 + 0.769421 = 0.951057 = 0.293892, 


and Ugo = Ug9, Ugg = Uy by symmetry; and so on. We thus obtain the following values of the displacement 
u(x, t) of the string over the first half-cycle: 


t x=0 x = 0.2 x= 0:4 x = 0.6 x = 0.8 x=1 
0.0 0 0.588 0.951 0.951 0.588 0 
0.2 0 0.476 0.769 0.769 0.476 0 
0.4 0 0.182 0.294 0.294 0.182 0 
0.6 0 —0.182 —0.294 —0.294 —0.182 0 
0.8 0 —0.476 —0.769 —0.769 —0.476 0 
1.0 0 —0.588 —0.951 —0.951 —0.588 0 

These values are exact to 3D (3 decimals), the exact solution of the problem being (see Sec. 12.3) 
u(x, ft) = sin 77x cos Tt. 
The reason for the exactness follows from d’Alembert’s solution (4), Sec. 12.4. (See Prob. 4, below.) | 


This is the end of Chap. 21 on numerics for ODEs and PDEs, a field that continues to 
develop rapidly in both applications and theoretical research. Much of the activity in the 
field is due to the computer serving as an invaluable tool for solving large-scale and 
complicated practical problems as well as for testing and experimenting with innovative 
ideas. These ideas could be small or major improvements on existing numeric algorithms 
or testing new algorithms as well as other ideas. 


PROBLEM SET 21-7 


VIBRATING STRING 4. Another starting formula. Show that (12) in Sec. 12.4 


1-3 


h = k = 0.2 for the given initial deflection f(x) and initial 
velocity 0 on the given f-interval. 


1. f(@) =xif0=x<4, f@) =40 -anifs SxS, 
Ostsl 


2. fx) =2x7 - x3, OS1tS2 


gives the starting formula 
Using the present method, solve (1)-(4) with 


| I ajt+k 
Wil = 5 (uj+1,0 + Ui-1,0) + | g(s) ds 


aj—-k 


(where one can evaluate the integral numerically if 
necessary). In what case is this identical with (8)? 


5. Nonzero initial displacement and speed. IIlustrate the 


3. f(x) = 0.2(% — 2"), e722 starting procedure when both f and g are not identically 
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6. 


7. 


zero, say, f(x) = 1—cos27x, g(x) = x(1 — x), 
h=k=0.1, 2 time steps. 
Solve (1)-(3) (h = k = 0.2,5 time steps) subject to 


f(x) = x”, g(x) = 2x, u(0, ) = 24, ud.) = 1 + 9. 
Zero initial displacement. If the string governed by the 
wave equation (1) starts from its equilibrium position with 
initial velocity g(x) = sin 77x, what is its displacement 
at time t = 0.4 and x = 0.2, 0.4, 0.6, 0.8? (Use the 
present method with h = 0.2, k = 0.2. Use (8). Compare 
with the exact values obtained from (12) in Sec. 12.4.) 


8. 


10. 
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Compute approximate values in Prob. 7, using a finer 
grid (h = 0.1, k = 0.1), and notice the increase in 
accuracy. 


. Compute uw in Prob. 5 for t=0.1 and x= 0.1, 


0.2,---, 0.9, using the formula in Prob. 8, and compare 
the values. 


Show that from d’ Alembert’s solution (13) in Sec.12.4 
with c = 1 it follows that (6) in the present section 
gives the exact value uj j41 = u(ih, (j + Ih). 


CHAPTER 217 REVIEW QUESTIONS AND PROBLEMS 


1. 


12. 


13. 


14. 


15. 


16. 


Explain the Euler and improved Euler methods 
in geometrical terms. Why did we consider these 
methods? 


. How did we obtain numeric methods from the Taylor 


series? 


. What are the local and the global orders of a method? 


Give examples. 


. Why did we compute auxiliary values in each Runge— 


Kutta step? How many? 


. What is adaptive integration? How does its idea extend 


to Runge-Kutta? 


. What are one-step methods? Multistep methods? The 


underlying ideas? Give examples. 


. What does it mean that a method is not self-starting? 


How do we overcome this problem? 


. What is a predictor—corrector method? Give an 


important example. 


. What is automatic step size control? When is it needed? 


How is it done in practice? 


. How do we extend Runge-Kutta to systems of ODEs? 
11. 


Why did we have to treat the main types of PDEs in 
separate sections? Make a list of types of problems and 
numeric methods. 


When and how did we use finite differences? Give as 
many details as you can remember without looking 
into the text. 


How did we approximate the Laplace and Poisson 
equations? 

How many initial conditions did we prescribe for the 
wave equation? For the heat equation? 


Can we expect a difference equation to give the exact 
solution of the corresponding PDE? 


In what method for PDEs did we have convergence 
problems? 


17. 


18. 


19. 


20. 


21. 


22. 


23. 


25. 


26. 


27. 


28. 


Solve y’ = y, (0) = 1 by Euler’s method, 10 steps, 
h=0.1. 

Do Prob. 17 with h = 0.01, 10 steps. Compute the errors. 
Compare the error for x = 0.1 with that in Prob. 17. 
Solve y’ = 1 + y”, y(0) = 0 by the improved Euler 
method, h = 0.1, 10 steps. 

Solve y +y=(«+ 1)’, y(0) = 3 by the improved 
Euler method, 10 steps with h = 0.1. Determine the 
errors. 


Solve Prob. 19 by RK with h = 0.1, 5 steps. Compute 
the error. Compare with Prob. 19. 


Fair comparison. Solve y’ = 2x7!Vy — Inx + x7}, 
y(1) = 0 for 1 S x = 1.8 (a) by the Euler method with 
h = 0.1, (b) by the improved Euler method with 
h = 0.2, and (c) by RK with h = 0.4. Verify that the 
exact solution is y = (In x + Inx. Compute and 
compare the errors. Why is the comparison fair? 


Apply the Adams—Moulton method to y’ = V1 — y’, 
y(0)=0, A=0.2, x=0,-::, 1, starting with 


0.198668, 0.389416, 0.564637. 


. Apply the A-M method to y’ = (x + y — 4)”, (0) = 4, 


h = 0.2, x = 0,---, 1, starting with 4.00271, 4.02279, 
4.08413. 

Apply Euler’s method for systems to y” = xy, 
y(0) = 1, y'(0) = 0, = 0.1, 5 steps. 

Apply Euler’s method for systems to y, = yo, 
ys = —4y1, y1(0) = 2, yo(0) = 0, h = 0.2, 10 steps. 
Sketch the solution. 

Apply Runge-Kutta for systems to y” + y = 2e”, 
y(0) = 0, y’(0) = 1, h = 0.2, 5 steps. Determine the 
errors. 

Apply Runge-Kutta for systems to yy = 6y; + 9yo, 
y2 = 1 + 6y2, yi(0) = —3, yo(0) = —3, h = 0.05, 
3 steps. 
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29. Find rough approximate values of the electrostatic u(1, t) = 0 by the method in Sec. 21.7 with h = 0.1 
potential at P,1, Py2, Pig in Fig. 471 that lie in a field and k = 0.1 for t = 0.3. 
between conducting plates (in Fig. 471 appearing as 
sides of a rectangle) kept at potentials 0 and 220 V as 32-34! POTENTIAL 
shown. (Use the indicated grid.) 


Find the potential in Fig. 472, using the given grid and the 
y boundary values: 
u=220V 
32. u(Po1) = u(Pos) = u(Pa1) = u(Pag) = 200, 
u(Pio) = u(P39) = —400, u(Poo) = 1600, 
u(Fo2) = u(Pag) = u(Pi4) = u(Pea) = u(P3q) = 0 


33. u(Pyo9) = u(P39) = 960, u(P9) = —480, u=0 
u=0 elsewhere on the boundary 


34. wu = 70 on the upper and left sides, uw = 0 on the lower 
and right sides 


Fig. 471. Problem 29 


30. A laterally insulated homogeneous bar with ends at 
x = 0 and x = 1 has initial temperature 0. Its left end 
is kept at 0, whereas the temperature at the right end 
varies sinusoidally according to 


u(t, 1) = g(t) = sin 2 wt. 
Fig. 472. Problems 32-34 
Find the temperature u(x, t) in the bar [solution of (1) 


se te by - aes ene esr =the and 35. Solve uz = Uy (0 Sx S1,t2 0), 

pe ne Pn seme u(x, 0) = x2(1 — x), (0, 1) = u(1, ) = 0 by Crank 
31. Find the solution of the vibrating string problem Nicolson with h = 0.2, k = 0.04, 5 time steps. 

Ur = Ugy, Ux,0)=x1-— x), u=0, uO, = 


SUMMARY-OF-CHAPTER-21 


Numerics for ODEs and PDEs 


In this chapter we discussed numerics for ODEs (Secs. 21.1—21.3) and PDEs (Secs. 
21.4-21.7). Methods for initial value problems 


(1) y =f(x,y), yo) = yo 


involving a first-order ODE are obtained by truncating the Taylor series 


2 
ye + h) = y(x) + hy") + Fy" aa 


Summary of Chapter 21 
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where, by (1), y =f,y” =f" = af/ax + (Af/dy)y’, etc. Truncating after the term 
hy’, we get the Euler method, in which we compute step by step 


(2) Yn+1 = Yn + h(n, Yn) (n = 0, 1,---). 


Taking one more term into account, we obtain the improved Euler method. Both 
methods show the basic idea but are too inaccurate in most cases. 

Truncating after the term in h*, we get the important classical Runge-Kutta 
(RK) method of fourth order. The crucial idea in this method is the replacement 
of the cumbersome evaluation of derivatives by the evaluation of f(x, y) at 
suitable points (x, y); thus in each step we first compute four auxiliary quantities 
(Sec. 21.1) 


ky = hf (tn Yn) 
ke = hfen + 3h, yn + 3k) 
ks = hf(tn + 3h, yn + 3k2) 
ka = hfxn + hy yn + kg) 


(3a) 


and then the new value 
(3b) Yn+1 = Yn + e(k1 + 2ko + 2kg + ka). 


Error and step size control are possible by step halving or by RKF 
(Runge—Kutta—Fehlberg). 

The methods in Sec. 21.1 are one-step methods since they get y,+1 from the 
result y, of a single step. A multistep method (Sec. 21.2) uses the values of 
Yn Yn-1,'°* Of several steps for computing y,+1. Integrating cubic interpolation 
polynomials gives the Adams-Bashforth predictor (Sec. 21.2) 


(4a) Vio = Yn + sahOGSn = 5%fe-1 + 37fa—2 — Wns) 


where f; = f(x;, y;), and an Adams—Moulton corrector (the actual new value) 
(4b) Yn+1 = In + sghOfns1 + 19, — Sf-1 + fr—2)s 
where f#,41 = f(%n+1, y4+1)- Here, to get started, yy, ye, yz must be computed by 


the Runge-Kutta method or by some other accurate method. 
Section 19.3 concerned the extension of Euler and RK methods to systems 


y =f(x y), thus Yj = fi, Y1,°°* Ym) j=ljyctym. 


This includes single mth-order ODEs, which are reduced to systems. Second-order 
equations can also be solved by RKN (Runge—Kutta—Nystr6m) methods. These are 
particularly advantageous for y” = f(x, y) with f not containing y’. 
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Numeric methods for PDEs are obtained by replacing partial derivatives by 
difference quotients. This leads to approximating difference equations, for the 
Laplace equation to 
(5) Uj4+1,7 ae Uij4+1 + Uj-1,j =F Ujj-1 — 4ui; =0 (Sec. 21.4) 


for the heat equation to 
1 1 
(6) k (Uj,j+1 — Uy) = pe ith — Quy + ui-1,4) (Sec. 21.6) 


and for the wave equation to 


1 1 
(7) 2 (Uj joa — 255 + Uyj—1) = pe ith — Quy + uj-1,;) (Sec. 21.7); 


here h and k are the mesh sizes of a grid in the x- and y-directions, respectively, 
where in (6) and (7) the variable y is time ¢. 

These PDEs are elliptic, parabolic, and hyperbolic, respectively. Corresponding 
numeric methods differ, for the following reason. For elliptic PDEs we have 
boundary value problems, and we discussed for them the Gauss—Seidel method 
(also known as Liebmann’s method) and the ADI method (Secs. 21.4, 21.5). For 
parabolic PDEs we are given one initial condition and boundary conditions, and 
we discussed an explicit method and the Crank—Nicolson method (Sec. 21.6). For 
hyperbolic PDEs, the problems are similar but we are given a second initial 
condition (Sec. 21.7). 


Optimization, 
Graphs 


CHAPTER 22 _——Unconstrained Optimization. Linear Programming 
CHAPTER 23 ~=Graphs. Combinatorial Optimization 


The material of Part F is particularly useful in modeling large-scale real-world problems. 
Just as it is in numerics in Part E, where the greater availability of quality software and 
computing power is a deciding factor in the continued growth of the field, so it is also in 
the fields of optimization and combinatorial optimization. Problems, such as optimizing 
production plans for different industries (microchips, pharmaceuticals, cars, aluminum, 
steel, chemicals), optimizing usage of transportation systems (usage of runways in airports, 
tracks of subways), efficiency in running of power plants, optimal shipping (delivery 
services, shipping of containers, shipping goods from factories to warehouses and from 
warehouses to stores), designing optimal financial portfolios, and others are all examples 
where the size of the problem usually requires the use of optimization software. More 
recently, environmental concerns have put new aspects into the picture, where an important 
concern, added to these problems, is the minimization of environmental impact. The main 
task becomes to model these problems correctly. The purpose of Part F is to introduce 
the main ideas and methods of unconstrained and constrained optimization (Chap. 22), 
and graphs and combinatorial optimization (Chap. 23). 


Chapter 22 introduces unconstrained optimization by the method of steepest descent and 
constrained optimization by the versatile simplex method. The simplex method (Secs. 
22.3, 22.4) is very useful for solving many linear optimization problems (also called linear 
programming problems). 


Graphs \et us model problems in transportation logistics, efficient use of communication 
networks, best assignment of workers to jobs, and others. We consider shortest path problems 
(Secs. 22.2, 22.3), shortest spanning trees (Secs. 23.4, 23.5), flow problems in networks (Secs. 
23.6, 23.7), and assignment problems (Sec. 23.8). We discuss algorithms of Moore, Dijkstra 
(both for shortest path), Kruskal, Prim (shortest spanning trees), and Ford—Fulkerson (for flow). 
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CHAPTER 2 2 


Unconstrained Optimization. 
Linear Programming 


Optimization is a general term used to describe types of problems and solution techniques 
that are concerned with the best (“optimal”’) allocation of limited resources in projects. The 
problems are called optimization problems and the methods optimization methods. Typical 
problems are concerned with planning and making decisions, such as selecting an optimal 
production plan. A company has to decide how many units of each product from a choice 
of (distinct) products it should make. The objective of the company may be to maximize 
overall profit when the different products have different individual profits. In addition, the 
company faces certain limitations (constraints). It may have a certain number of machines, 
it takes a certain amount of time and usage of these machines to make a product, it requires 
a certain number of workers to handle the machines, and other possible criteria. To solve 
such a problem, you assign the first variable to number of units to be produced of the first 
product, the second variable to the second product, up to the number of different (distinct) 
products the company makes. When you multiply these, for example, by the price, you 
obtain a linear function called the objective function. You also express the constraints in 
terms of these variables, thereby obtaining several inequalities, called the constraints. 
Because the variables in the objective function also occur in the constraints, the objective 
function and the constraints are tied mathematically to each other and you have set up a 
linear optimization problem, also called a linear programming problem. 

The main focus of this chapter is to set up (Sec. 22.2) and solve (Secs. 22.3, 22.4) such 
linear programming problems. A famous and versatile method for doing so is the simplex 
method. In the simplex method, the objective function and the constraints are set up in 
the form of an augmented matrix as in Sec. 7.3, however, the method of solving such 
linear constrained optimization problems is a new approach. 

The beauty of the simplex method is that it allows us to scale problems up to thousands 
or more constraints, thereby modeling real-world situations. We can start with a small 
model and gradually add more and more constraints. The most difficult part is modeling 
the problem correctly. The actual task of solving large optimization problems is done by 
software implementations for the simplex method or perhaps by other optimization methods. 

Besides optimal production plans, problems in optimal shipping, optimal location of 
warehouses and stores, easing traffic congestion, efficiency in running power plants are 
all examples of applications of optimization. More recent applications are in minimizing 
environmental damages due to pollutants, carbon dioxide emissions, and other factors. 
Indeed, new fields of green logistics and green manufacturing are evolving and naturally 
make use of optimization methods. 


Prerequisite: a modest working knowledge of linear systems of equations. 
References and Answers to Problems: App. | Part F, App. 2. 
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22.| Basic Concepts. 


Unconstrained Optimization: 
Method of Steepest Descent 


In an optimization problem the objective is to optimize (maximize or minimize) some 
function f. This function f is called the objective function. It is the focal point or goal of 
our optimization problem. 

For example, an objective function fto be maximized may be the revenue in a production 
of TV sets, the rate of return of a financial portfolio, the yield per minute in a chemical 
process, the mileage per gallon of a certain type of car, the hourly number of customers 
served in a bank, the hardness of steel, or the tensile strength of a rope. 

Similarly, we may want to minimize f if f is the cost per unit of producing certain 
cameras, the operating cost of some power plant, the daily loss of heat in a heating system, 
COz emissions from a fleet of trucks for freight transport, the idling time of some lathe, 
or the time needed to produce a fender. 

In most optimization problems the objective function f depends on several variables 


Mis? ys 


These are called control variables because we can “control” them, that is, choose their values. 

For example, the yield of a chemical process may depend on pressure x, and temperature 
xg. The efficiency of a certain air-conditioning system may depend on temperature xj, air 
pressure x2, moisture content x3, cross-sectional area of outlet x4, and so on. 

Optimization theory develops methods for optimal choices of x1, ++, x», which maximize 
(or minimize) the objective function f, that is, methods for finding optimal values of x1, -+- , Xp. 

In many problems the choice of values of x1,---, x, is not entirely free but is subject 
to some constraints, that is, additional restrictions arising from the nature of the problem 
and the variables. 

For example, if x, is production cost, then x, = 0, and there are many other variables 
(time, weight, distance traveled by a salesman, etc.) that can take nonnegative values only. 
Constraints can also have the form of equations (instead of inequalities). 

We first consider unconstrained optimization in the case of a function f(4%q, °°: , Xn). 
We also write x = (%1,°°:,X,,) and f(x), for convenience. 


By definition, f has a minimum at a point x = Xo in a region R (where f is defined) if 
F(x) = f(Xo) 
for all x in R. Similarly, f has a maximum at Xp in R if 
F(x) = f(Xo) 


for all x in R. Minima and maxima together are called extrema. 
Furthermore, f is said to have a local minimum at Xo if 


F(x) = f(Xo) 
for all x in a neighborhood of Xo, say, for all x satisfying 
x= 50 = (Gi ay eo Ge Ke 


where Xo = (Xj,°:-, X,) and r > 0 is sufficiently small. 
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Similarly, f has a local maximum at Xo if f(x) S f(Xo) for all x satisfying Ix — Xol <r. 

If fis differentiable and has an extremum at a point Xo in the interior of a region R 
(that is, not on the boundary), then the partial derivatives df/dx,,--+, 0f/0x, must be zero 
at Xo. These are the components of a vector that is called the gradient of f and denoted 
by grad f or Vf. (For n = 3 this agrees with Sec. 9.7.) Thus 


(1) Vf(X) = 0. 


A point Xo at which (1) holds is called a stationary point of f. 

Condition (1) is necessary for an extremum of f at Xo in the interior of R, but is not 
sufficient. Indeed, if n = 1, then for y = f(x), condition (1) is y’ =f (Xo) = 0; and, for 
instance, y = x? satisfies y = 3x 0 at x = Xp = 0 where f has no extremum but a 
point of inflection. Similarly, for f(x) = x1x2g we have Vf(0) = 0, and f does not have an 
extremum but has a saddle point at 0. Hence, after solving (1), one must still find out 
whether one has obtained an extremum. In the case n = | the conditions y'(X) = 0, 
y"(Xo) > 0 guarantee a local minimum at Xo and the conditions y'(Xo) = 0, y"(Xo) <Oa 
local maximum, as is known from calculus. Form > | there exist similar criteria. However, 
in practice, even solving (1) will often be difficult. For this reason, one generally prefers 
solution by iteration, that is, by a search process that starts at some point and moves 
stepwise to points at which f is smaller (if a minimum of f is wanted) or larger (in the 
case of a maximum). 

The method of steepest descent or gradient method is of this type. We present it here 
in its standard form. (For refinements see Ref. [E25] listed in App. 1.) 

The idea of this method is to find a minimum of f(x) by repeatedly computing minima 
of a function g(t) of a single variable ¢, as follows. Suppose that f has a minimum at Xo 
and we start at a point x. Then we look for a minimum of f closest to x along the straight 
line in the direction of —Vf(x), which is the direction of steepest descent (= direction 
of maximum decrease) of f at x. That is, we determine the value of ¢ and the correspond- 
ing point 


(2) z(t) = x — tVf(x) 


at which the function 


(3) g(t) = f(z(t)) 
has a minimum. We take this z(f) as our next approximation to Xo. 


Method of Steepest Descent 
Determine a minimum of 
(4) f(x) = x? + 3x8, 


starting from x9 = (6, 3) = 6i + 3j and applying the method of steepest descent. 


Solution. Clearly, inspection shows that f(x) has a minimum at 0. Knowing the solution gives us a better 
feel of how the method works. We obtain Vf(x) = 2x i + 6x2j and from this 


u(t) = x — tVf(x) = (1 — 2)xi + (1 — 6fxoj 


g(t) = f(a) = (1 — 20?x7 + 31 — 69)?x3. 
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We now calculate the derivative 


g(t) = 21 — 2x9(—2) + 6(1 — 61)x3(—-6), 
set g(t) = 0, and solve for ¢, finding 


2 
xy 9x5 


2x? + 54x23 


Starting from x9 = 6i + 3j, we compute the values in Table 22.1, which are shown in Fig. 473. 

Figure 473 suggests that in the case of slimmer ellipses (“a long narrow valley”), convergence would be 
poor. You may confirm this by replacing the coefficient 3 in (4) with a large coefficient. For more sophisticated 
descent and other methods, some of them also applicable to vector functions of vector variables, we refer to the 
references listed in Part F of App. 1; see also [E25]. Bi 


Fig. 473. Method of steepest descent in Example 1 


Table 22.1 Method of Steepest Descent, Computations in Example 1 


n x t iL = ey il = @ 
0 6.000 3.000 0.210 0.581 —0.258 
1 3.484 —0.774 0.310 0.381 —0.857 
2 1.327 0.664 0.210 0.581 —0.258 
3 0.771 —0.171 0.310 0.381 —0.857 
4 0.294 0.147 0.210 0.581 —0.258 
5 0.170 —0.038 0.310 0.381 —0.857 
6 0.065 0.032 


PROBLEM SET 22-71 


1. Orthogonality. Show that in Example 1, successive 5. f(x) =ax, + bxo, a#0,b#0. First guess, then 


gradients are orthogonal (perpendicular). Why? compute. 
2. What happens if you apply the method of steepest 6. f(x) = x7 — x8, xo = (1,2), 5 steps. First guess, 
descent to f(x) = xt + x3? First guess, then calculate. then compute. Sketch the path. What if xq = (2, 1)? 
7. f(x) = x3 ar cx3, Xo = (c, 1). Show that 2 steps give 
3-9| STEEPEST DESCENT (c, 1) times a factor, —4¢?/(c? — 1)”. What can you 
Do steepest descent steps when: conclude from this about the speed of convergence? 
3. f(x) = 2x4 + x3 — 4x1 + 4x2, x9 = 0, 3 steps 8. f(x) = x4 — X9, Xo = (1, 1); 3 steps. Sketch your path. 
4. f(x) = x7 + 0.5x3 — 5.0x1 — 3.0x2 + 24.95, Predict the outcome of further steps. 


xo = (3,4), 5 steps 9. f(x) = 0.1x? + x2 — 0.02x1, x9 = (3,3), 5 steps 
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10. CAS EXPERIMENT. Steepest Descent. (a) Write a (c) Apply your program to f(x) = x? + x§ and to 
program for the method. f(x) = xf + x§, xo = (2, 1). Graph level curves and 
(b) Apply your program to f(x) = x7 + 4x3, exper- your path of descent. (Try to include graphing directly 
imenting with respect to speed of convergence depending in your program.) 


on the choice of xo. 


22.2 Linear Programming 


EXAMPLE 1 


Linear programming or linear optimization consists of methods for solving optimization 
problems with constraints, that is, methods for finding a maximum (or a minimum) 
X = (44,°°*,Xy) of a linear objective function 


Z = f(X) = ayxy + doxg + +++ + ayxn 


satisfying the constraints. The latter are linear inequalities, such as 3x, + 4x2 = 36, or 
x, = 0, etc. (examples below). Problems of this kind arise frequently, almost daily, for 
instance, in production, inventory management, bond trading, operation of power plants, 
routing delivery vehicles, airplane scheduling, and so on. Progress in computer technology 
has made it possible to solve programming problems involving hundreds or thousands or 
more variables. Let us explain the setting of a linear programming problem and the idea 
of a “geometric” solution, so that we shall see what is going on. 


Production Plan 


Energy Savers, Inc., produces heaters of types S and L. The wholesale price is $40 per heater for § and $88 for 
L. Two time constraints result from the use of two machines My and Mg. On M, one needs 2 min for an S heater 
and 8 min for an L heater. On Mg one needs 5 min for an S heater and 2 min for an L heater. Determine production 
figures x, and xg for S and L, respectively (number of heaters produced per hour), so that the hourly revenue 


z= f(x) = 40xy + 88x92 


is maximum. 


Solution. Production figures x, and x2 must be nonnegative. Hence the objective function (to be maximized) 
and the four constraints are 


(0) z = 40x, + 88x5 

(1) 2x, + 8xg S 60 min time on machine My 
(2) 5x, + 2x2 S 60 min time on machine Mo 
(3) x4 2.0 

(4) x22 0. 


Figure 474 shows (0)—-(4) as follows. Constancy lines 
Z = const 


are marked (0). These are lines of constant revenue. Their slope is —40/88 = —5/11. To increase z we must 
move the line upward (parallel to itself), as the arrow shows. Equation (1) with the equality sign is marked (1). 
It intersects the coordinate axes at x; = 60/2 = 30 (set xg = 0) and xg = 60/8 = 7.5 (set xy = 0). The arrow 
marks the side on which the points (x, xg) lie that satisfy the inequality in (1). Similarly for Eqs. (2)-(4). The 
blue quadrangle thus obtained is called the feasibility region. It is the set of all feasible solutions, meaning 
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solutions that satisfy all four constraints. The figure also lists the revenue at O, A, B, C. The optimal solution 
is obtained by moving the line of constant revenue up as much as possible without leaving the feasibility region 
completely. Obviously, this optimum is reached when that line passes through B, the intersection (10, 5) of (1) 
and (2). We see that the optimal revenue 


Zmax = 40+ 10 + 88-5 = $840 


is obtained by producing twice as many S heaters as L heaters. ia] 


z=0 
:z2=40-12=480 

: 2=40-10+88-5=840 
: 2=88-7.5=660 


O: 
A 
B 
Cc 


), +S ©). 
May S849 


Fig. 474. Linear programming in Example 1 


Note well that the problem in Example | or similar optimization problems cannot be 
solved by setting certain partial derivatives equal to zero, because crucial to such problems 
is the region in which the control variables are allowed to vary. 

Furthermore, our “geometric” or graphic method illustrated in Example | is confined 
to two variables x1, x2. However, most practical problems involve much more than two 
variables, so that we need other methods of solution. 


Normal Form of a Linear Programming Problem 


To prepare for general solution methods, we show that constraints can be written more 
uniformly. Let us explain the idea in terms of (1), 


2x1 + 8x = 60. 
This inequality implies 60 — 2x, — 8xy = O (and conversely), that is, the quantity 
X3 = 60 — 2x1 — 8xe 
is nonnegative. Hence, our original inequality can now be written as an equation 
2x1 + 8x + xg = 60, 
where 


x3 = 0. 
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EXAMPLE 2 
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x3 is a nonnegative auxiliary variable introduced for converting inequalities to equations. 
Such a variable is called a slack variable, because it “takes up the slack” or difference 
between the two sides of the inequality. 


Conversion of Inequalities by the Use of Slack Variables 


With the help of two slack variables x3, x4 we can write the linear programming problem in Example | in the 
following form. Maximize 


f = 40x1 + 88x92 


subject to the constraints 


2x1 + 8xo + x3 = 60 
5x1 + 2x2 +xq = 60 
x; 20 (i = 1,---, 4). 
We now have n = 4 variables and m = 2 (linearly independent) equations, so that two of the four variables, 


for example, x1, x2, determine the others. Also note that each of the four sides of the quadrangle in Fig. 474 
now has an equation of the form x; = 0: 


OA: x2 = 0, 
AB: x4 = 0, 
BC: x3 = 0, 
CO: x, = 0, 


A vertex of the quadrangle is the intersection of two sides. Hence at a vertex, n — m = 4 — 2 = 2 of the 
variables are zero and the others are nonnegative. Thus at A we have x2 = 0, x4 = 0, and so on. i 


Our example suggests that a general linear optimization problem can be brought to the 
following normal form. Maximize 


(5) i = GO ae Gee) ar 8S a ee 
subject to the constraints 


41X11 qp peo ap AinXn = by 


dgjX1 + +++ + denXn = be 


(6) 
Os bet IP POO SP Clear stn = Don 
ye) (= IhPo? 70) 
with all b; nonnegative. (If a b; < 0, multiply the equation by —1.) Here x,---, x, include 


the slack variables (for which the c;’s in f are zero). We assume that the equations in (6) 
are linearly independent. Then, if we choose values for n — m of the variables, the system 
uniquely determines the others. Of course, since we must have 


this choice is not entirely free. 
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Our problem also includes the minimization of an objective function f since this 
corresponds to maximizing —f and thus needs no separate consideration. 

An n-tuple (x1,°*+,*,,) that satisfies all the constraints in (6) is called a feasible point 
or feasible solution. A feasible solution is called an optimal solution if, for it, the objective 
function f becomes maximum, compared with the values of f at all feasible solutions. 

Finally, by a basic feasible solution we mean a feasible solution for which at least 
n — m of the variables x1,---,x, are zero. For instance, in Example 2 we have n = 4, 
m = 2, and the basic feasible solutions are the four vertices O, A, B, C in Fig. 474. Here 
B is an optimal solution (the only one in this example). 

The following theorem is fundamental. 


THEOREM -1 Optimal Solution 


Some optimal solution of a linear programming problem (5), (6) is also a basic 


feasible solution of (5), (6). 


For a proof, see Ref. [F5], Chap. 3 (listed in App. 1). A problem can have many optimal 
solutions and not all of them may be basic feasible solutions; but the theorem guarantees 
that we can find an optimal solution by searching through the basic feasible solutions 


n n 
only. This is a great simplification; but since there are ( ) = ( ) different ways 
n-m m 


of equating n — m of the n variables to zero, considering all these possibilities, dropping 
those which are not feasible and then searching through the rest would still involve very 
much work, even when v and m are relatively small. Hence a systematic search is needed. 
We shall explain an important method of this type in the next section. 


PROBLEEM—SET 22-2 


1-6| REGIONS, CONSTRAINTS ae Say eg 


Describe and graph the regions in the first quadrant of xXy+x25 5 
the x 4x -plane determined by the given inequalities. 


0 


IV 


1. X1 — 3x9 2-6 


6. Xy,T XQ a 2 


xi + x25 6 Su Sty 15 


2: 2x41 = X29 = 6 


2x 1” «2 22 
8x, + 10xg = 80 —x,; + 2xe = 10 
X1— 2X, 2-3 7. Location of maximum. Could we find a_ profit 
= + doX2 whose maximum is at an 
3, —0.5x, + <? F(%1, X2) 2X1 + aXe naxin 
us : interior point of the quadrangle in Fig. 474? Give 
Xy + Xg 22 reason for your answer. 
—x, + 5x9 25 8. Slack variables. Why are slack variables always 
nonnegative? How many of them do we need? 
4. -x ; ti x2 = Re) i : : : 
9. What is the meaning of the slack variables x3, x4 in 
2x1 + x22 10 Example 2 in terms of the problem in Example 1? 
xe92 4 10. Uniqueness. Can we always expect a unique solution 


i 2 
10x, + 15x <= 150 (as in Example 1)? 
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11-16 


MAXIMIZATION, MINIMIZATION 


Maximize or minimize the given objective function f 
subject to the given constraints. 


11. 
12. 
13. 
14. 
15. 


16. 


17. 


18. 


Maximize f = 30x; + 10xg in the region in Prob. 5. 
Minimize f = 45.0x1 + 22.5xg in the region in Prob. 4. 
Maximize f = 5x 1 + 25x in the region in Prob. 5. 
Minimize f = 5x1 + 25xg in the region in Prob. 3. 
Maximize f = 20x; + 30x subject to 4x1 + 3x9 
12, Xy — X22 —3, Xo =6, 2x1 — 3x9 S0. 
Maximize f= —10x, + 2x9 to x, 20, 
Xo = 0, X1 t X= 11 Xg S5. 
Maximum profit. United Metal, Inc., produces alloys 
By (special brass) and By (yellow tombac). B, contains 
50% copper and 50% zinc. (Ordinary brass contains 
about 65% copper and 35% zinc.) Bg contains 75% 
copper and 25% zinc. Net profits are $120 per ton of 
B, and $100 per ton of Bg. The daily copper supply is 
45 tons. The daily zinc supply is 30 tons. Maximize 
the net profit of the daily production. 


> 


subject 
XY t x9 = 6, 


Maximum profit. The DC Drug Company produces 
two types of liquid pain killer, N (normal) and S$ 
(Super). Each bottle of N requires 2 units of drug A, 1 
unit of drug B, and 1 unit of drug C. Each bottle of S 
requires | unit of A, 1 unit of B, and 3 units of C. The 
company is able to produce, each week, only 1400 units 
of A, 800 units of B, and 1800 units of C. The profit 
per bottle of N and S is $11 and $15, respectively. 
Maximize the total profit. 


22.3 Simplex Method 


From the last section we recall the following. A linear optimization problem (linear 
programming problem) can be written in normal form; that is: 


19. 


20. 


21. 


22. 


Maximize 


(1) 


2 = iC) = Gein ar ee 


Linear Programming 


Maximum output. Giant Ladders, Inc., wants to 
maximize its daily total output of large step ladders by 
producing x; of them by a process P, and xg by a 
process Py, where P;, requires 2 hours of labor and 
4 machine hours per ladder, and P, requires 3 hours of 
labor and 2 machine hours. For this kind of work, 1200 
hours of labor and 1600 hours on the machines are, at 
most, available per day. Find the optimal x1 and x9. 


Minimum cost. Hardbrick, Inc., has two kilns. Kiln 
I can produce 3000 gray bricks, 2000 red bricks, and 
300 glazed bricks daily. For Kiln II the corresponding 
figures are 2000, 5000, and 1500. Daily operating costs 
of Kilns I and II are $400 and $600, respectively. Find 
the number of days of operation of each kiln so that 
the operation cost in filling an order of 18,000 gray, 
34,000 red, and 9000 glazed bricks is minimized. 


Maximum profit. Universal Electric, Inc., manufactures 
and sells two models of lamps, L and Lg, the profit being 
$150 and $100, respectively. The process involves two 
workers W, and Ws who are available for this kind 
of work 100 and 80 hours per month, respectively. 
W, assembles L in 20 min and Ly in 30 min. Ws paints 
Ly, in 20 min and Lg in 10 min. Assuming that all lamps 
made can be sold without difficulty, determine production 
figures that maximize the profit. 

Nutrition. Foods A and B have 600 and 500 calories, 
contain 15 g and 30 g of protein, and cost $1.80 and $2.10 
per unit, respectively. Find the minimum cost diet of at 
least 3900 calories containing at least 150 g of protein. 


aI (Graben 


subject to the constraints 


Ghee) a 2° ar Chen = 


by 


dg3X1 + +++ + danXn = be 


(2) eee 


Gy ata eats yy Gy 


aa) 


ee 


ee 


ec ec r eee ee ee 
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For finding an optimal solution of this problem, we need to consider only the basic feasible 
solutions (defined in Sec. 22.2), but there are still so many that we have to follow a 
systematic search procedure. In 1948 G. B. Dantzig’ published an iterative method, called 
the simplex method, for that purpose. In this method, one proceeds stepwise from one 
basic feasible solution to another in such a way that the objective function f always 
increases its value. Let us explain this method in terms of the example in the last section. 

In its original form the problem concerned the maximization of the objective function 


z= 40x, + 88x 


2x41 = 8x2 


IIA 


60 


5x1 + 2x9, = 60 
subject to 
X1 = 0 


XQ = 0. 
Converting the first two inequalities to equations by introducing two slack variables x3, x4, 


we obtained the normal form of the problem in Example 2. Together with the objective 
function (written as an equation z — 40x, — 88x2 = 0) this normal form is 


z — 40x; — 88x2 = 0 
(3) 2x4 + 8xo+ X3 = 60 
5x41 + 2x9 + x4 = 60 


where x; = 0,-+-, x4 = 0. This is a linear system of equations. To find an optimal solution 
of it, we may consider its augmented matrix (see Sec. 7.3) 


Ve xy Xo xg X4 b 

1'-40 -88 '!0 0! O 
al I [- 

(4) T.=| 0; 2 | 0 | 60 
| | | 

0; 5 2;0 #1) 60 


1GEORGE BERNARD DANTZIG (1914-2005), American mathematician, who is one of the pioneers of 
linear programming and inventor of the simplex method. According to Dantzig himself (see G. B. Dantzig, 
Linear programming: The story of how it began, in J. K. Lenestra et al., History of Mathematical Programming: 
A Collection of Personal Reminiscences. Amsterdam: Elsevier, 1991, pp. 19-31), he was particularly fascinated 
by Wassilly Leontief’s input-output model (Sec. 8.2) and invented his famous method to solve large-scale 
planning (logistics) problems. Besides Leontief, Dantzig credits others for their pioneering work in linear 
programming, that is, JOHN VON NEUMANN (1903-1957), Hungarian American mathematician, Institute for 
Advanced Studies, Princeton University, who made major contributions to game theory, computer science, 
functional analysis, set theory, quantum mechanics, ergodic theory, and other areas, the Nobel laureates LEONID 
VITALIYEVICH KANTOROVICH (1912-1986), Russian economist, and TJALLING CHARLES 
KOOPMANS (1910-1985), Dutch-American economist, who shared the 1975 Nobel Prize in Economics for 
their contributions to the theory of optimal allocation of resources. Dantzig was a driving force in establishing 
the field of linear programming and became professor of transportation sciences, operations research, and 
computer science at Stanford University. For his work see R. W. Cottle (ed.), The Basic George B. Dantzig. 
Palo Alto, CA: Stanford University Press, 2003. 
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This matrix is called a simplex tableau or simplex table (the initial simplex table). These 
are standard names. The dashed lines and the letters 


Z X1, uy b 


are for ease in further manipulation. 

Every simplex table contains two kinds of variables x;. By basic variables we mean 
those whose columns have only one nonzero entry. Thus x3, x4 in (4) are basic variables 
and x1, X» are nonbasic variables. 

Every simplex table gives a basic feasible solution. It is obtained by setting the nonbasic 
variables to zero. Thus (4) gives the basic feasible solution 


x, = 0, Xo = 0, x3 = 60/1 = 60, x4 = 60/1 = 60, z=0 


with x3 obtained from the second row and x4 from the third. 

The optimal solution (its location and value) is now obtained stepwise by pivoting, 
designed to take us to basic feasible solutions with higher and higher values of z until the 
maximum of z is reached. Here, the choice of the pivot equation and pivot are quite 
different from that in the Gauss elimination. The reason is that x1, x2, x3, x4 are restricted 
to nonnegative values. 


Step 1. Operation O,: Selection of the Column of the Pivot 
Select as the column of the pivot the first column with a negative entry in Row |. In (4) 
this is Column 2 (because of the —40). 


Operation Og: Selection of the Row of the Pivot. Divide the right sides [60 and 60 in 
(4)] by the corresponding entries of the column just selected (60/2 = 30, 60/5 = 12). 
Take as the pivot equation the equation that gives the smallest quotient. Thus the pivot 
is 5 because 60/5 is smallest. 


Operation O3: Elimination by Row Operations. This gives zeros above and below the 
pivot (as in Gauss—Jordan, Sec. 7.8). 


With the notation for row operations as introduced in Sec. 7.3, the calculations in Step 1 
give from the simplex table Tg in (4) the following simplex table (augmented matrix), 
with the blue letters referring to the previous table. 


Zz Xy Xo X3 X4 b 
be) Oh =e |. o 8 | 480 Row 1 + 8 Row 3 
T T T 
(5) T,=|0 |; 0 72 | 1 -04 | 36 Row 2 — 0.4 Row 3 
| | | 
0:5 2 10 +1 +t 60 


We see that basic variables are now x1, x3 and nonbasic variables are x9, x4. Setting the 
latter to zero, we obtain the basic feasible solution given by Ty, 


x, =60/5=12, x2.=0, x3 =36/1=36, x4g=0, z= 480. 


This is A in Fig. 474 (Sec. 22.2). We thus have moved from O: (0,0) with z = 0 to 
A: (12, 0) with the greater z = 480. The reason for this increase is our elimination of a 


SEC. 22.3 Simplex Method 961 


term (—40x ) with a negative coefficient. Hence elimination is applied only to negative 
entries in Row | but to no others. This motivates the selection of the column of the pivot. 

We now motivate the selection of the row of the pivot. Had we taken the second row 
of To instead (thus 2 as the pivot), we would have obtained z = 1200 (verify!), but this 
line of constant revenue z = 1200 lies entirely outside the feasibility region in Fig. 474. 
This motivates our cautious choice of the entry 5 as our pivot because it gave the smallest 
quotient (60/5 = 12). 


Step 2. The basic feasible solution given by (5) is not yet optimal because of the negative 
entry —72 in Row |. Accordingly, we perform the operations O, to Og again, choosing a 
pivot in the column of —72. 


Operation O,. Select Column 3 of T, in (5) as the column of the pivot (because —72 < 0). 


Operation Og. We have 36/7.2 = 5 and 60/2 = 30. Select 7.2 as the pivot (because 
5 < 30). 


Operation O3. Elimination by row operations gives 


Zz Xy Xe X3 X4 b 

Le oo 10 4 | 840 | Row 1 + 10 Row 2 
I- + 4 

(6) T;=| 040 721 1 -04 | 36 

| | | 
l 1 1 | 2 

0; 5 0 | -~=z = | +50 Row 3 — =~ Row 2 
| 3.6 0.9 1 hie 


We see that now x4, x9 are basic and x3, x4 nonbasic. Setting the latter to zero, we obtain 
from Ty the basic feasible solution 


x1 =50/5=10, x9 =36/7.2=5, x3=0, x4=0, z= 840. 


This is B in Fig. 474 (Sec. 22.2). In this step, z has increased from 480 to 840, due to the 
elimination of —72 in T;. Since Ty contains no more negative entries in Row 1, we 
conclude that z = f(10,5) = 40- 10 + 88 - 5 = 840 is the maximum possible revenue. 
It is obtained if we produce twice as many S heaters as L heaters. This is the solution of 
our problem by the simplex method of linear programming. a 


Minimization. If we want to minimize z = f(x) (instead of maximize), we take as the 
columns of the pivots those whose entry in Row | is positive (instead of negative). In 
such a Column k we consider only positive entries f,, and take as pivot a t;, for which 
b;/t;,, is smallest (as before). For examples, see the problem set. 


PROBLEM SET 2273 


1. Verify the calculations in Example 1 of the text. 2. The problem in the example in the text with the 


2-14 


SIMPLEX METHOD 


constraints interchanged. 
3. Maximize f = 3x1 + 2x2 subject to 3xy + 4xo S 60, 


Write in normal form and solve by the simplex method, 4x, + 3x9 = 60, 10x, + 2x2 = 120. 
assuming all x; to be nonnegative. 
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4. Maximize the daily output in producing x; chairs by 10. Minimize f= 4x; — 10xg — 20x3 subject to 3x, + 


Process P, and xg chairs by Process P, subject to 4x5 + 5x3 S 60, 2x1 + xq S 20, 2x, + 3x3 S 30. 

3x1 + 4x2 = 550 (machine hours), 5x1 + 4x2 = 650 11. Prob. 22 in Problem Set 22.2. 

(labor). 12. Maximize f = 2x1 + 3xg + xg subject to xy + x2 + 
5. Minimize f = 5x, — 20xg subject to —2x, + 10x2 x3 = 4.8, 10x, + x3 S 9.9, x5 — x3 = 0.2. 

= 5, ia + 5xq = 10. 13. Maximize f = 34x; + 29xg + 32x3 subject to 8x, + 
6. Prob. 19 in Sec. 22.2. Qxg + xg S54, 3x1 + 8xq + 2xg S59, xy + xQ+ 
7. Suppose we produce x, AA batteries by Process 5x3 = 39. 

P, and xg by Process P,, furthermore x3 A batteries by 14. Maximize f = 2x, + 3x2 subject to 5x; + 3x2 S 105, 

Process P3 and x4 by Process P,. Let the profit for 100 3x1 + 6x2 S 126. 


batteries be $10 for AA and $20 for A. Maximize the 15 


: : . . CAS PROJECT. Simple Method. (a) Write a program 
total profit subject to the constraints 


for graphing a region R in the first quadrant of the 


12x, + 8xo 6x3 4x4 = 120 (Material) X4Xg-plane determined by linear constraints. 
3x1 + 6xq + 12x3 + 24x4 = 180 (Labor). (b) Write a program for maximizing z = ayx1 + dox2 

8. Maximize the daily profit in producing x; metal frames in R. 

F (profit $90 per frame) and x2 frames Fy (profit $50 (c) Write a program for maximizing z= a,x, + 

per frame) subject to x, + 3x2 S18 (material), “++ + GyXn Subject to linear constraints. 

x1 + xg = 10 (machine hours), 3x1 + x2 S 24 (labor). (d) Apply your programs to problems in this problem 
9. Maximize f = 2x, + x2 + 3xg subject to 4x, + 3xg + set and the previous one. 

6x3 = 12. 


22.4 Simplex Method: Difficulties 


In solving a linear optimization problem by the simplex method, we proceed stepwise 
from one basic feasible solution to another. By so doing, we increase the value of the 
objective function f. We continue this stepwise procedure, until we reach an optimal 
solution. This was all explained in Sec. 22.3. However, the method does not always proceed 
so smoothly. Occasionally, but rather infrequently in practice, we encounter two kinds of 
difficulties. The first one is the degeneracy and the second one concerns difficulties in 
starting. 


Degeneracy 


A degenerate feasible solution is a feasible solution at which more than the usual number 
n — m of variables are zero. Here n is the number of variables (slack and others) and m 
the number of constraints (not counting the x; = 0 conditions). In the last section, n = 4 
and m = 2, and the occurring basic feasible solutions were nondegenerate; n — m = 2 
variables were zero in each such solution. 

In the case of a degenerate feasible solution we do an extra elimination step in which 
a basic variable that is zero for that solution becomes nonbasic (and a nonbasic variable 
becomes basic instead). We explain this in a typical case. For more complicated cases 
and techniques (rarely needed in practice) see Ref. [F5] in App. 1. 


EXAMPLE 1_ Simplex Method, Degenerate Feasible Solution 


AB Steel, Inc., produces two kinds of iron 4, /z by using three kinds of raw material Ry, Ro, Rg (scrap iron and 
two kinds of ore) as shown. Maximize the daily profit. 
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Raw Material Needed 
Raw per Ton Raw Material Available 
Material per Day (tons) 
Tron /; Tron [5 

Ry 2 1 16 

Ro 1 1 8 

R3 0 1 3.5 
Net profit 

$150 $300 
per ton 
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Solution. Let x1 and xz denote the amount (in tons) of iron J; and Jp, respectively, produced per day. Then 


our problem is as follows. Maximize 


(1) z =f(x) = 150x1 + 300x2 


subject to the constraints xy = 0, x2 = O and 


2x1 + xo S16 (raw material R1) 
xXy tx 8 (raw material Ro) 


xg = 3.5 (raw material R3). 


By introducing slack variables x3, x4, x5 we obtain the normal form of the constraints 


2x, + xg + x3 = 16 
(2) xy +X2 + x4 = 8 
x2 +x5 = 3.5 
x, 20 (Gi = 1,--°, 5). 


As in the last section we obtain from (1) and (2) the initial simplex table 


Zz xy Xp x3 X4 X5 b 
1 !' —150 -—300 ! 0 0 0! 0 
4 4 L 
| | | 
0 | 2 1 11 0 0 | 16 
(3) To = | | | 
G7; 4 1 | 0 1 0} 8 
| | | 
0 | 0 1 10 0 11 35 


We see that x1, x are nonbasic variables and x3, x4, 5 are basic. With xy = xg = O we have from (3) the basic 


feasible solution 


m=0, x2=0, x3=16/1=16, x=8/1=8, x5 = 3.5/1 =3.5, 


This is O:(0, 0) in Fig. 475. We have n = 5 variables x;,m = 3 constraints, and n — m = 2 variables equal to 


zero in our solution, which thus is nondegenerate. 


Step 1 of Pivoting 


Operation Oy: Column Selection of Pivot. Column 2 (since —150 < 0). 


Operation Oz: Row Selection of Pivot. 16/2 = 8, 8/1 = 8; 3.5/0 is not possible. Hence we could choose 


Row 2 or Row 3. We choose Row 2. The pivot is 2. 
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Operation Og: Elimination by Row Operations. This gives the simplex table 


z Xy 9 Xp x3 x4 X5 b 
1 4 0 —225 i 75 0 0 | 1200 Row | + 75 Row 2 
o | 2 Tt ve i 0 oO; 16 
(4) T, = l 
0 0 3 -k 1 0 0 Row 3 — 3 Row 2 
| | | 
0 1 0 1 | 0 0 Hit 335 Row 4 


We see that the basic variables are x1, x4, X5 and the nonbasic are x9, x3. Setting the nonbasic variables to zero, 
we obtain from Tj the basic feasible solution 


Fig. 475. Example 1, where A is degenerate 


x, = 16/2=8, x2=0, x3=0, x4=0/1=0, x5 =3.5/1=3.5, z= 1200. 


This is A: (8,0) in Fig. 475. This solution in degenerate because x4 = 0 (in addition to xy = 0,x3 = 0); 
geometrically: the straight line x4 = 0 also passes through A. This requires the next step, in which x4 will 
become nonbasic. 


Step 2 of Pivoting 
Operation O: Column Selection of Pivot. Column 3 (since —225 < 0). 
Operation Oz: Row Selection of Pivot. 16/1 = 16, 0/5 = 0. Hence 5 must serve as the pivot. 


Operation Og: Elimination by Row Operations. This gives the following simplex table. 


Zz xy Xo, x3 X4 X5 b 

110 01-150 450 0 ! 1200 7] Row I + 450 Row 3 

o}-2 | 2 3 01 16 | Row2-—2Row3 
oN at od 

010 bl -4 1 0! 0 

Cre oO) tf = 1 | 35] Row4—2Row3 


We see that the basic variables are x1, x2, x5 and the nonbasic are x3, x4. Hence x4 has become nonbasic, as 
intended. By equating the nonbasic variables to zero we obtain from Tz the basic feasible solution 


xy =16/2=8, x.=0/=0, x3=0, x4=0, 2x5=35/1=35, z= 1200. 


This is still A: (8, 0) in Fig. 475 and z has not increased. But this opens the way to the maximum, which we 
reach in the next step. 


SEC. 22.4 Simplex Method: _ Difficulties 965 


EXAMPLE 2 


Step 3 of Pivoting 
Operation O;: Column Selection of Pivot. Column 4 (since —150 < 0). 


Operation Oz: Row Selection of Pivot. 16/2 = 8, 0/(—3) = 0, 3.5/1 = 3.5. We can take | as the pivot. 
(With i as the pivot we would not leave A. Try it.) 


Operation Os: Elimination by Row Operations. This gives the simplex table 


Zz Xx Xp x3 x4 x5 b 
110 0 !0 150 150 ! 1725 Row 1 + 150 Row 4 
4 b b 
0;2 oO; 0 a a: Row 2 — 2 Row 4 
(6) Tz = | | | 
010 3 LO 0 3 | 1.75 | Row3 +4Row4 
6, Opi 8 i 


We see that basic variables are x1, x2, x3 and nonbasic x4, x5. Equating the latter to zero we obtain from T3 the 
basic feasible solution 


x, = 9/2 = 45, x2 = 1.75/% = 3.5, x3 = 3.5/1 = 3.5, x4 = 0, x5 = 0, z= 1725. 
This is B: (4.5, 3.5) in Fig. 475. Since Row | of Ts has no negative entries, we have reached the maximum daily 


profit Zmax = f(4.5, 3.5) = 150 + 4.5 + 300 + 3.5 = $1725. This is obtained by using 4.5 tons of iron J; and 
3.5 tons of iron Jo. i | 


Difficulties in Starting 


As a second kind of difficulty, it may sometimes be hard to find a basic feasible solution 
to start from. In such a case the idea of an artificial variable (or several such variables) 
is helpful. We explain this method in terms of a typical example. 


Simplex Method: Difficult Start, Artificial Variable 
Maximize 

(7) z= f(x) = 2x1 + x2 
subject to the constraints x7 2 0, x2 = 0 and (Fig. 476) 


xy — 3x2 21 
x1 oS 2 


xy t+ x9 S4. 


Solution. By means of slack variables we achieve the normal form of the constraints 


Z—2x,- Xe =0 

X1 — $xXq — x3 = 1 

(8) Xx, X2 + X4 =2 
Xy+ XQ +x5=4 
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Note that the first slack variable is negative (or zero), which makes x3 nonnegative within the feasibility region 
(and negative outside). From (7) and (8) we obtain the simplex table 


Zz xy Xo x3 X4g X5 b 
1 We =? -1 |! 0 0 Oo ! 0 
a L 4 
ot wt |} at oo eT 4 
| ae | 
0 1 —1 ; 0 1 0 2 
co); kt Lp * Le 


X41, X are nonbasic, and we would like to take x3, x4, x5 as basic variables. By our usual process of equating 
the nonbasic variables to zero we obtain from this table 


x4 =0, x2=0, xg=I/(-D=-l, xwga=7=2, x5=$=4, 2=0. 
X3 <0 indicates that (0, 0) lies outside the feasibility region. Since x3 < 0, we cannot proceed immediately. 
Now, instead of searching for other basic variables, we use the following idea. Solving the second equation in 
(8) for x3, we have 


xg=—-ltx —4xo. 


To this we now add a variable xg on the right, 


Fig. 476. Feasibility region in Example 2 


(9) x3 =—-l+x4- dx + x6. 


Xg is called an artificial variable and is subject to the constraint xg = 0. 

We must take care that xg (which is not part of the given problem!) will disappear eventually. We shall see 
that we can accomplish this by adding a term —Mxg with very large M to the objective function. Because of 
(7) and (9) (solved for xg) this gives the modified objective function for this “extended problem” 


(10) £=z7— Mxg = 2x1 + x2 — Mxg = (2 + Mx, + (1 — $M)xe — Mx — M. 


We see that the simplex table corresponding to (10) and (8) is 


Zz xy Xo XxX X4 X5 x6 b 
1;-2-M -1+iM) M 0 0 0; -M 
oO; 1 1 | -l 0 0 oO; 1 
2 
| | | 
To=| 01 1 -1 1 0 1 0 01 2 
| | | 
a; 4 1 | 0 0 1 0; 4 
| | | 
01 1 -3 1 -l 0 0 11 ot 
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The last row of this table results from (9) written as x, — dx — x3 + xg = 1. We see that we can now start, 
taking x4, X5, %g as the basic variables and x1, xg, x3 as the nonbasic variables. Column 2 has a negative first 
entry. We can take the second entry (1 in Row 2) as the pivot. This gives 


Zz xy Xp, x3 x4 X5 X6 b 

1 ! 0 2) il 22 0 0 Oo ! 2 
4+ 4 + 
| | | 

Oo, 1 +, 71 0 0 oO , 1 
| | I 

T=} 0 1 O -$ 1 1 1 0 Oo 1 1 
| | | 
| 3s) | 

0 ; 0 2 | 1 0 1 0 ; 3 
| | | 

0 1 0 0 | 0 0 0 1 1 0 

This corresponds to x1 = 1, x2 = 0 (point A in Fig. 476), x3 = 0,x4 = 1,x5 = 3,xg = 0. We can now drop 


Row 5 and Column 7. In this way we get rid of xg, as wanted, and obtain 


Tz = 


In Column 3 we choose 3 as the next pivot. We obtain 


Zz xy Xo x3 X4 x5 b 
1! 0 -2 1 -2 0 0 1 2 
+ + | 
oj, 1 af -1 0 0; 1 
| | | 
| pe a | | 
0 | 0 ae. 1 0 ! 1 
0 1 0 ,) & * tL a3 
Zz xy Xp Xg X4 X5 b 
1 ! 0 0 1 _2 0 z 1 6 
| | 3 | 
| | | 
0 | 1 0 | -2 0 x | 2 
| | 4 i | 
O10 oO) 4 1 a 1.2 
| | | 
0 |; O 2) 1 0 1 | 3 


This corresponds to x; = 2, x2 = 2 (this is B in Fig. 476), x3 = 0,xq4 = 2,x5 = 0. In Column 4 we choose 3 


as the pivot, by the usual principle. This gives 


Zz xy Xo x3 (X4 X5 b 
1 ! O 0 | 3 2 1 7 
4 kb ¥ 7 Pe 
i | I 
0 ! 1 0 ! 0 3 x 3 
I | 4 2. ll 
0 ! 0 0 ! 3 1 3 ! 2 
oO 1 9 310 -$ Gi 3 


This corresponds to x, = 3,x2g = 1 (point C in Fig. 476), x3 = 3x4 = 0,x5 = 0. This is the maximum 


fax = f(3, 1) = 7. 


We have reached the end of our discussion on linear programming. We have presented 
the simplex method in great detail as this method has many beautiful applications and 
works well on most practical problems. Indeed, problems of optimization appear in civil 
engineering, chemical engineering, environmental engineering, management science, 
logistics, strategic planning, operations management, industrial engineering, finance, and 
other areas. Furthermore, the simplex method allows your problem to be scaled up from 
a small modeling attempt to a larger modeling attempt, by adding more constraints and 
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variables, thereby making your model more realistic. The area of optimization is an active 
field of development and research and optimization methods, besides the simplex method, 
are being explored and experimented with. 


PROBLEM SET 22-4 


1. 


Maximize z = f(x) = 7x1 + 14x» subject to 0 Sy 
=6,0 Sxeq S3, 7x1, + 14x29 S 84. 


. Do Prob. 1 with the last two constraints interchanged. 


. Maximize the daily output in producing x steel sheets 


by process P, and xg steel sheets by process Pg subject 
to the constraints of labor hours, machine hours, and 
raw material supply: 


3x4 + 2x9 — 180, 4x, am 6x9 = 200, 
5x41 ca 3x9 = 160. 


. Maximize z = 300x; + 500xg subject to 2x; + 8x2 


= 60, 2x4 bi a X92 = 30, 4x4 a 4x9 = 60. 


. Do Prob. 4 with the last two constraints interchanged. 


Comment on the resulting simplification. 


. Maximize the total output f= x1 + x2 + x3 (pro- 


duction from three distinct processes) subject to input 
constraints (limitation of time available for production) 


5x4 + 6x92 Shr 7x3 = 12; 
7X41 + 4x9 Tr Xg = 12; 
. Maximize f = 5x; + 8xg + 4x3 subject to x; 20 
G=1,::°,5) and xy +x3 4+ x5 = 1,xX2 + x3 
+ X4 =1. 


. Using an artificial variable, minimize f = 4x1 — x2 subject 


to xy + XxX» 2 2, —2xy + 3x_ S 1,5xy + 4x9 S 50. 


. Maximize f = 2x1 + 3x9 + 2x3, x1 = 0, x2 = 0, 


X3 = 0, x4 +P 2x9 amd 4x3 5 2X4 aia 2x9 =F 2x3 5 3: 


CHAPTER 22 REVIEW QUESTIONS AND PROBLEMS 


1. 


Cnmrran 


What is unconstrained optimization? Constraint optimiza- 
tion? To which one do methods of calculus apply? 


. State the idea and the formulas of the method of steepest 


descent. 


. Write down an algorithm for the method of steepest descent. 


. Design a “method of steepest ascent” for determining 


maxima. 


. What is the method of steepest descent for a function 


of a single variable? 


. What is the basic idea of linear programming? 

. What is an objective function? A feasible solution? 

. What are slack variables? Why did we introduce them? 
. What happens in Example | of Sec. 22.1 if you replace 


f(x) = x7 + 3x3 with f(x) = x7 + 5x3? Start from 


Xo = [6 3]'. Do 5 steps. Is the convergence faster or 
slower? 


. Apply the method of steepest descent to f(x) = 9x? + 


xe + 18x1 — 4x9, 5 steps. Start from xp = [2 4y". 


11. In Prob. 10, could you start from [0 0] and do 5 steps? 
12. Show that the gradients in Prob. 11 are orthogonal. Give 
a reason. 

13-16} Graph or sketch the region in the first quadrant 
of the x1x9-plane determined by the following inequalities. 
13. xy — 2x29 S -2 

0.8x3 + xoS 6 


14, eS 2x9 2-4 
2x41 T X92 = 12 
Xz, 7T XQ S 8 
15. X11 XQ Ss 5 
x9 = 3 
XxX, 1 XQ =) 
16. XY + X92 = 2 
2x41 _ 3x9 = =12 
Xy = 15 
17-20 | Maximize or minimize as indicated. 
17. Maximize f = 10x, + 20xg subject to x7 S5,2x, + 


18. 


19. 


20. 


X9 = 6, X92 S 4. 
Maximize f= x1 + xg subject to x1 + 2x2 = 10, 
2x9 + x92 = 10, x9 =4, 
Minimize f = 2x; — 10x» subject to x1 — x9 =4, 
2x, + xq = 14, xXy t+ x29, =x4 + 3xo'= 15. 
A factory produces two kinds of gaskets, G1, Go, with 
net profit of $60 and $30, respectively, Maximize the 
total daily profit subject to the constraints (xj; = number 
of gaskets G; produced per day): 
40x, + 40x2g = 1800 (Machine hours), 
200x1 + 20x = 6300 (Labor). 
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SUMMARY-OF-CHAPTER-2-2 


Unconstrained Optimization. Linear Programming 


In optimization problems we maximize or minimize an objective function z = f(x) 
depending on control variables x1,---,x,, whose domain is either unrestricted 
(“unconstrained optimization,” Sec. 22.1) or restricted by constraints in the form 
of inequalities or equations or both (“constrained optimization,” Sec. 22.2). 

If the objective function is /inear and the constraints are linear inequalities in 
X4,°**,Xm, then by introducing slack variables x,,.1,---,x, we can write the 
optimization problem in normal form with the objective function given by 


(1) fi = 1X1 +++ + CyXn, 
(where C41 = *** = Cy = 0) and the constraints given by 


Ay4X 1 + a42X92 tere t AnXtn = by 


(2) 
AmiX1 + AmaX2 + +** + AmnXn = bm 
= 03a, = 0; 
In this case we can then apply the widely used simplex method (Sec. 22.3), a 


systematic stepwise search through a very much reduced subset of all feasible 
solutions. Section 22.4 shows how to overcome difficulties with this method. 


CHAPTER 2 3 


Graphs. 
Combinatorial Optimization 


Many problems in electrical engineering, civil engineering, operations research, industrial 
engineering, management, logistics, marketing, and economics can be modeled by graphs 
and directed graphs, called digraphs. This is not surprising as they allow us to model 
networks, such as roads and cables, where the nodes may be cities or computers. The 
task then is to find the shortest path through the network or the best way to connect 
computers. Indeed, many researchers who made contributions to combinatorial 
optimization and graphs, and whose names lend themselves to fundamental algorithms 
in this chapter, such as Fulkerson, Kruskal, Moore, and Prim, all worked at Bell 
Laboratories in New Jersey, the major R&D facilities of the huge telephone and 
telecommunication company AT&T. As such, they were interested in methods of 
optimally building computer networks and telephone networks. The field has progressed 
into looking for more and more efficient algorithms for very large problems. 

Combinatorial optimization deals with optimization problems that are of a pronounced 
discrete or combinatorial nature. Often the problems are very large and so a direct search 
may not be possible. Just like in linear programming (Chap. 22), the computer is an 
indispensible tool and makes solving large-scale modeling problems possible. Because 
the area has a distinct flavor, different from ODEs, linear algebra, and other areas, we 
start with the basics and gradually introduce algorithms for shortest path problems (Secs. 
22.2, 22.3), shortest spanning trees (Secs. 23.4, 23.5), flow problems in networks (Secs. 
23.6, 23.7), and assignment problems (Sec. 23.8). 


Prerequisite: none. 
References and Answers to Problems: App. | Part F, App. 2. 


23.1 Graphs and Digraphs 
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Roughly, a graph consists of points, called vertices, and lines connecting them, called 
edges. For example, these may be four cities and five highways connecting them, as in 
Fig. 477. Or the points may represent some people, and we connect by an edge those who 
do business with each other. Or the vertices may represent computers in a network and 
the edge connections between them. Let us now give a formal definition. 
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DEFINITION 


Loop 


Isolated 
, vertex 


Double edge 
Fig. 477. Graph consisting of Fig. 478. Isolated vertex, loop, double 
4 vertices and 5 edges edge. (Excluded by definition.) 


Graph 


A graph G consists of two finite sets (sets having finitely many elements), a set V 
of points, called vertices, and a set E of connecting lines, called edges, such that 
each edge connects two vertices, called the endpoints of the edge. We write 


G = (V,E). 
Excluded are isolated vertices (vertices that are not endpoints of any edge), loops 


(edges whose endpoints coincide), and multiple edges (edges that have both 
endpoints in common). See Fig. 478. 


CAUTION! Our three exclusions are practical and widely accepted, but not uniformly. 
For instance, some authors permit multiple edges and call graphs without them simple 
graphs. ia 


We denote vertices by letters, u,vV,-+- Or Uz, V2,°** or simply by numbers 1, 2,--- (as 
in Fig. 477). We denote edges by ej, e2,::-or by their two endpoints; for instance, 
ey = (1, 4), eg = C1, 2) in Fig. 477. 

An edge (uj, Uj) is called incident with the vertex v; (and conversely); similarly, (v;, vj) 
is incident with v;. The number of edges incident with a vertex vu is called the degree of v. 
Two vertices are called adjacent in G if they are connected by an edge in G (that is, if they 
are the two endpoints of some edge in G). 

We meet graphs in different fields under different names: as “networks” in electrical 
engineering, “structures” in civil engineering, “molecular structures” in chemistry, 
“organizational structures” in economics, “sociograms,” “road maps,” “telecommunication 
networks,” and so on. 
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Digraphs (Directed Graphs) 


Nets of one-way streets, pipeline networks, sequences of jobs in construction work, flows 
of computation in a computer, producer—consumer relations, and many other applications 
suggest the idea of a “digraph” (= directed graph), in which each edge has a direction 
(indicated by an arrow, as in Fig. 479). 


DEFINITION 
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Fig. 479. Digraph 


Digraph (Directed Graph) 


A digraph G = (V, E) is a graph in which each edge e = (i, /) has a direction from 
its “initial point” i to its “terminal point” j. 


Two edges connecting the same two points i, 7 are now permitted, provided they have 
opposite directions, that is, they are (i, 7) and (j, i). Example. (1, 4) and (4, 1) in Fig. 479. 

A subgraph or subdigraph of a given graph or digraph G = (V, E), respectively, is a 
graph or digraph obtained by deleting some of the edges and vertices of G, retaining the 
other edges of G (together with their pairs of endpoints). For instance, e;, e3 (together 
with the vertices 1, 2, 4) form a subgraph in Fig. 477, and eg, e4, es (together with the 
vertices 1, 3, 4) form a subdigraph in Fig. 479. 


Computer Representation of Graphs and Digraphs 


Drawings of graphs are useful to people in explaining or illustrating specific situations. 
Here one should be aware that a graph may be sketched in various ways; see Fig. 480. 
For handling graphs and digraphs in computers, one uses matrices or lists as appropriate 
data structures, as follows. 


(a) (b) (c) 
Fig. 480. Different sketches of the same graph 


Adjacency Matrix of a Graph G:_ Matrix A = [a;;] with entries 


1 if G has an edge (i, j), 
ay = 


0 else. 


Thus a;; = | if and only if two vertices i and j are adjacent in G. Here, by definition, no 
vertex is considered to be adjacent to itself; thus, a;; = 0. A is symmetric, aj; = aj;. (Why?) 

The adjacency matrix of a graph is generally much smaller than the so-called incidence 
matrix (see Prob. 18) and is preferred over the latter if one decides to store a graph in a 
computer in matrix form. 
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EXAMPLE 1 Adjacency Matrix of a Graph 


Vertex 1 2 3 4 
Vertex | 0 1 0 1 
2 1 0 1 1 

3 0 1 0 1 


4.1 1 1 + 0 2 


Adjacency Matrix of a Digraph G: Matrix A = [a;;] with entries 


{" if G has a directed edge (i, j), 
aj = 


0 else. 


This matrix A need not be symmetric. (Why?) 


EXAMPLE 2. Adjacency Matrix of a Digraph 


To vertex 1 2 3 4 


From vertex | 0 1 0 0 
2 1 0 0) 1 
3 0 1 0 0 
4 L0 0 0 0 a 


Lists. The vertex incidence list of a graph shows, for each vertex, the incident edges. 
The edge incidence list shows for each edge its two endpoints. Similarly for a digraph; 
in the vertex list, outgoing edges then get a minus sign, and in the edge list we now have 
ordered pairs of vertices. 


EXAMPLE 3_ Vertex Incidence List and Edge Incidence List of a Graph 


This graph is the same as in Example 1, except for notation. 


Vertex Incident Edges Edge Endpoints 
UY, @4;,€5 ey U1, V2 
v2 €1, €2, €3 e2 U2, U3 
v3 C2, €4 e3 U2, V4 
v4 €3, C4, C5 €4 U3, V4 

&5 U1, V4 
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Sparse graphs are graphs with few edges (far fewer than the maximum possible number 
n(n — 1)/2, where n is the number of vertices). For these graphs, matrices are not efficient. 
Lists then have the advantage of requiring much less storage and being easier to handle; 
they can be ordered, sorted, or manipulated in various other ways directly within the 
computer. For instance, in tracing a “walk” (a connected sequence of edges with pairwise 
common endpoints), one can easily go back and forth between the two lists just discussed, 
instead of scanning a large column of a matrix for a single 1. 

Computer science has developed more refined lists, which, in addition to the actual 
content, contain “pointers” indicating the preceding item or the next item to be scanned 
or both items (in the case of a “walk”: the preceding edge or the subsequent one). For 


details, see Refs. [E16] and [F7]. 


This section was devoted to basic concepts and notations needed throughout this chapter, 
in which we shall discuss some of the most important classes of combinatorial optimization 
problems. This will at the same time help us to become more and more familiar with 


graphs and digraphs. 


PROBLEEM—SET 23-1 


. Explain how the following can be regarded as a graph 


or a digraph: a family tree, air connections between 
given cities, trade relations between countries, a tennis 
tournament, and memberships of some persons in some 
committees. 


. Sketch the graph consisting of the vertices and edges 


of a triangle. Of a pentagon. Of a tetrahedron. 


. How would you represent a net of two-way and one- 


way streets by a digraph? 


. Worker W, can do jobs Jy, J3, Ja, worker Wo job Jz, 


and worker W3 jobs Jo, J3, J4. Represent this by a 
graph. 


. Find further situations that can be modeled by a graph 


or diagraph. 


ADJACENCY MATRIX 


6. 
vis 


Show that the adjacency matrix of a graph is symmetric. 


When will the adjacency matrix of a digraph be 
symmetric? 


8-13 


Find the adjacency matrix of the given graph or 


digraph. 


14. 


16. 


0 1 0 1 0 1 0 0 
1 0 1 0 1 0 0 0 
15. 

0 1 0 0 0 0 0 1 
1 0 0 0 0 0 1 0 


Complete graph. Show that a graph G with n vertices 
can have at most n(n — 1)/2 edges, and G has exactly 
n(n — 1)/2 edges if G is complete, that is, if every pair 
of vertices of G is joined by an edge. (Recall that loops 
and multiple edges are excluded.) 


SEC. 23.2 Shortest Path Problems. Complexity 


17. In what case are all the off-diagonal entries of the 
adjacency matrix of a graph G equal to one? 


18. Incidence matrix B of a graph. The definition is 
B = [bj], where 


ba. = 


i if vertex j is an endpoint of edge e;,, 
i 


0 otherwise. 


Find the incidence matrix of the graph in Prob. 8. 


23.2 Shortest Path Problems. 
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19. Incidence matrix B of a digraph. The definition is 
B = [bj], where 


—1. if edge e leaves vertex j, 


if edge e, enters vertex j, 


0 otherwise. 


Find the incidence matrix of the digraph in Prob. 11. 


20. Make the vertex incidence list of the digraph in Prob. 11. 


Complexity 


The rest of this chapter is devoted to the most important classes of problems of 
combinatorial optimization that can be represented by graphs and digraphs. We selected 
these problems because of their importance in applications, and present their solutions 
in algorithmic form. Although basic ideas and algorithms will be explained and 
illustrated by small graphs, you should keep in mind that real-life problems may often 
involve many thousands or even millions of vertices and edges. Think of computer 
networks, telephone networks, electric power grids, worldwide air travel, and companies 
that have offices and stores in all larger cities. You can also think of other ideas for 
networks related to the Internet, such as electronic commerce (networks of buyers and 
sellers of goods over the Internet) and social networks and related websites, such as 
Facebook. Hence reliable and efficient systematic methods are an absolute necessity— 
solutions by trial and error would no longer work, even if “nearly optimal” solutions 
were acceptable. 

We begin with shortest path problems, as they arise, for instance, in designing shortest 
(or least expensive, or fastest) routes for a traveling salesman, for a cargo ship, etc. Let 
us first explain what we mean by a path. 

In a graph G = (V, E) we can walk from a vertex v1 along some edges to some other 
vertex Uz. Here we can 


(A) make no restrictions, or 


(B) require that each edge of G be traversed at most once, or 


(C) require that each vertex be visited at most once. 


In case (A) we call this a walk. Thus a walk from v, to U;, is of the form 


(1) (Vy, VQ), (Va, U3), a) (Ux-1; Uk)» 


where some of these edges or vertices may be the same. In case (B), where each edge 
may occur at most once, we call the walk a trail. Finally, in case (C), where each vertex 
may occur at most once (and thus each edge automatically occurs at most once), we call 
the trail a path. 

We admit that a walk, trail, or path may end at the vertex it started from, in which case 
we call it closed; then v;, = Vv, in (1). 
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A closed path is called a cycle. A cycle has at least three edges (because we do not 
have double edges; see Sec. 23.1). Figure 481 illustrates all these concepts. 


fT 
© (4) 3) 
Fig. 481. Walk, trail, path, cycle 


1—2-—3-— 2isa walk (not a trail). 
4—1-—2-—3-4-— 5isa trail (not a path). 
1—2-—3-—4-— 5isa path (not a cycle). 
1—2-—3-—4-lisacycle. 


Shortest Path 


To define the concept of a shortest path, we assume that G = (V, E) is a weighted graph, 
that is, each edge (v;, v;) in G has a given weight or length 1,; > 0. Then a shortest path 
V1 — Ux (with fixed v1 and v;,) is a path (1) such that the sum of the lengths of its edges 


ly + log + Igq +++ + Up-1k 


(J42 = length of (vy, Vg), etc.) is minimum (as small as possible among all paths from 
Vy to U;). Similarly, a longest path v; — v;, is one for which that sum is maximum. 
Shortest (and longest) path problems are among the most important optimization problems. 
Here, “length” /;; (often also called “cost” or “weight’”) can be an actual length measured 
in miles or travel time or fuel expenses, but it may also be something entirely different. 
For instance, the traveling salesman problem requires the determination of a shortest 
Hamiltonian’ cycle in a graph, that is, a cycle that contains all the vertices of the graph. 
In more detail, the traveling salesman problem in its most basic and intuitive form can 
be stated as follows. You have a salesman who has to drive by car to his customers. He 
has to drive to n cities. He can start at any city and after completion of the trip he has to 
return to that city. Furthermore, he can only visit each city once. All the cities are linked by 
roads to each other, so any city can be visited from any other city directly, that is, if he 
wants to go from one city to another city, there is only one direct road connecting those two 
cities. He has to find the optimal route, that is, the route with the shortest total mileage for 
the overall trip. This is a classic problem in combinatorial optimization and comes up in 
many different versions and applications. The maximum number of possible paths to be 
examined in the process of selecting the optimal path for n cities is (n — 1)!/2, because, 
after you pick the first city, you have n — | choices for the second city, n — 2 choices for 
the third city, etc. You get a total of (n — 1)! (see Sec. 24.4). However, since the mileage 
does not depend on the direction of the tour (e.g., for n = 4 (four cities 1, 2, 3, 4), the tour 
1—2-3-4-1 has the same mileage as 14—3-—2-1, etc., so that we counted all the tours twice!), 
the final answer is (n — 1)!/2. Even for a small number of cities, say n = 15, the maximum 
number of possible paths is very large. Use your calculator or CAS to see for yourself! This 
means that this is a very difficult problem for larger n and typical of problems in 
combinatorial optimization, in that you want a discrete solution but where it might become 
nearly impossible to explicitly search through all the possibilities and therefore some 
heuristics (rules of thumbs, shortcuts) might be used, and a less than optimal answer suffices. 


WILLIAM ROWAN HAMILTON (1805-1865), Irish mathematician, known for his work in dynamics. 
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A variation of the traveling salesman problem is the following. By choosing the “most 
profitable” route vy — vz, a salesman may want to maximize Dhigs where /;; is his expected 
commission minus his travel expenses for going from town i to town /j. 

In an investment problem, i may be the day an investment is made, 7 the day it matures, 
and /;; the resulting profit, and one gets a graph by considering the various possibilities 
of investing and reinvesting over a given period of time. 


Shortest Path If All Edges Have Length | = 1 


Obviously, if all edges have length /, then a shortest path v; vy, is one that has the 
smallest number of edges among all paths v; — v,; in a given graph G. For this problem 
we discuss a BFS algorithm. BFS stands for Breadth First Search. This means that in 
each step the algorithm visits all neighboring (all adjacent) vertices of a vertex reached, 
as opposed to a DFS algorithm (Depth First Search algorithm), which makes a long trail 
(as in a maze). This widely used BFS algorithm is shown in Table 23.1. 


We want to find a shortest path in G from a vertex s (start) to a vertex t (terminal). To 
guarantee that there is a path from s to t, we make sure that G does not consist of separate 
portions. Thus we assume that G is connected, that is, for any two vertices v and w there 
is a path v ~w in G. (Recall that a vertex v is called adjacent to a vertex u if there is 
an edge (u, v) in G.) 


Table 23.1 Moore’s” BFS for Shortest Path (All Lengths One) 


Proceedings of the International Symposium for Switching Theory, Part II. pp. 285-292. Cambridge: Harvard 
University Press, 1959. 


ALGORITHM MOORE [G = (V, E), s, ¢] 


This algorithm determines a shortest path in a connected graph G = (V, E) from a vertex 
5 to a vertex f. 


INPUT: Connected graph G = (V, E), in which one vertex is denoted by s and 
one by 1, and each edge (i, j) has length /,; = 1. Initially all vertices are 
unlabeled. 


OUTPUT: A shortest path s > tin G = (V, E) 


1. Label s with 0. 
2. Set i = 0. 
3. Find all unlabeled vertices adjacent to a vertex labeled i. 
4. Label the vertices just found with i + 1. 
5. If vertex ¢ is labeled, then “backtracking” gives the shortest path 
k (= label of ), k — 1,k — 2,---,0 
OUTPUT k,k — 1,k — 2,---,0. Stop 
Else increase i by 1. Go to Step 3. 
End MOORE 


2EDWARD FORREST MOORE (1925-2003), American mathematician and computer scientist, who did 
pioneering work in theoretical computer science (automata theory, Turing machines). 
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Application of Moore’s BFS Algorithm 
Find a shortest path s + in the graph G shown in Fig. 482. 


Solution. Figure 482 shows the labels. The blue edges form a shortest path (length 4). There is another 
shortest path s — ft. (Can you find it?) Hence in the program we must introduce a rule that makes backtracking 
unique because otherwise the computer would not know what to do next if at some step there is a choice (for 
instance, in Fig. 482 when it got back to the vertex labeled 2). The following rule seems to be natural. 


Backtracking rule. Using the numbering of the vertices from | to n (not the labeling!), at each step, if a 
vertex labeled 7 is reached, take as the next vertex that with the smallest number (not label!) among all the 
vertices labeled i — 1. 


Fig. 482. Example 1, given graph and result of labeling 


Complexity of an Algorithm 


Complexity of Moore’s algorithm. To find the vertices to be labeled 1, we have to scan 
all edges incident with s. Next, wheni = 1, we have to scan all edges incident with vertices 
labeled 1, etc. Hence each edge is scanned twice. These are 2m operations (m = number of 
edges of G). This is a function c(m). Whether it is 2 or 5m + 3 or 12m is not so essential; 
it is essential that c(m) is proportional to m (not m?, for example); it is of the “order” m. 
We write for any function am + b simply O(m), for any function am? + bm + d simply 
O(m?), and so on; here, O suggests order. The underlying idea and practical aspect are 
as follows. 


In judging an algorithm, we are mostly interested in its behavior for very large problems 
(large m in the present case), since these are going to determine the limits of the 
applicability of the algorithm. Thus, the essential item is the fastest growing term 
(am? in am” + bm + d, etc.) since it will overwhelm the others when m is large enough. 
Also, a constant factor in this term is not very essential; for instance, the difference between 
two algorithms of orders, say, 5m” and 8m” is generally not very essential and can be 
made irrelevant by a modest increase in the speed of computers. However, it does make 
a great practical difference whether an algorithm is of order m or m” or of a still higher 
power m?. And the biggest difference occurs between these “polynomial orders” and 
“exponential orders,” such as 2 


For instance, on a computer that does 10° operations per second, a problem of size 

= 50 will take 0.3 sec with an algorithm that requires m° operations, but 13 days with 
an algorithm that requires 2”’ operations. But this is not our only reason for regarding 
polynomial orders as good and exponential orders as bad. Another reason is the gain in 
using a faster computer. For example, let two algorithms be O(@m) and O(m 2) Then, since 
1000 = 31.6”, an increase in speed by a factor 1000 has the effect that per hour we can 
do problems 1000 and 31.6 times as big, respectively. But since 1000 = 2°97 with an 
algorithm that is O(2™), all we gain is a relatively modest increase of 10 in problem size 
becaise 2?" "a2 = 9787 
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The symbol O is quite practical and commonly used whenever the order of growth is 
essential, but not the specific form of a function. Thus if a function g(m) is of the form 


g(m) = kh(m) + more slowly growing terms (k # 0, constant), 
we say that g(m) is of the order h(m) and write 
g(m) = O(h(n)). 
For instance, 
am+b=O(m), am? +bm+d=O(m), 5-2 + 3m? = 02. 


We want an algorithm & to be “efficient,” that is, “good” with respect to 


(i) Time (number c,(m) of computer operations), or 


(ii) Space (storage needed in the internal memory) 
or both. Here cy suggests “complexity” of «4. Two popular choices for cq are 


(Worst case) c,4(m) = longest time & takes for a problem of size m, 


(Average case) c,4(m) = average time & takes for a problem of size m. 


In problems on graphs, the “size” will often be m (number of edges) or n (number of 
vertices). For Moore’s algorithm, c.,4(m) = 2m in both cases. Hence the complexity of 
Moore’s algorithm is of order O(7m). 

For a “good” algorithm “, we want that c,,(m) does not grow too fast. Accordingly, 
we call efficient if c(m) = O(m hy for some integer k = 0; that is, c.g may contain 
only powers of m (or functions that grow even more slowly, such as In m), but no 
exponential functions. Furthermore, we call 4 polynomially bounded if © is efficient 
when we choose the “worst case” c,4(m). These conventional concepts have intuitive 
appeal, as our discussion shows. 

Complexity should be investigated for every algorithm, so that one can also compare 
different algorithms for the same task. This may often exceed the level in this chapter; 
accordingly, we shall confine ourselves to a few occasional comments in this direction. 


PROBLEM SET 23-2 


SHORTEST PATHS, MOORE’S BFS 3. t 4. 
(All edges length one) t 


1-4| Find a shortest path P: s—t and its length by 

Moore’s algorithm. Sketch the graph with the labels and 

indicate P by heavier lines as in Fig. 482. 5. Moore’s algorithm. Show that if vertex v has label 
LL r 2. ‘ A(v) = k, then there is a path sv of length k. 


6. Maximum length. What is the maximum number of 
s edges that a shortest path between any two vertices in 
a graph with n vertices can have? Give a reason. In a 

t complete graph with all edges of length 1? 
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7. Nonuniqueness. Find another shortest path from s to 
t in Example 1 of the text. 

8. Moore’s algorithm. Call the length of a shortest path 
s—v the distance of v from s. Show that if v has 
distance /, it has label A(v) = 1. 

9. CAS PROBLEM. Moore’s Algorithm. Write a 
computer program for the algorithm in Table 23.1. Test 
the program with the graph in Example 1. Apply it to 
Probs. 1-3 and to some graphs of your own choice. 


10-12} HAMILTONIAN CYCLE 


10. Find and sketch a Hamiltonian cycle in the graph of a 
dodecahedron, which has 12 pentagonal faces and 20 
vertices (Fig. 483). This is a problem Hamilton himself 
considered. 


Fig. 483. Problem 10 


11. Find and sketch a Hamiltonian cycle in Prob. 1. 
12. Does the graph in Prob. 4 have a Hamiltonian cycle? 


13-14 | POSTMAN PROBLEM 


13. The postman problem is the problem of finding a 
closed walk W: ss (s the post office) in a graph G 
with edges (i, j) of length /;; > 0 such that every edge 
of G is traversed at least once and the length of W is 
minimum. Find a solution for the graph in Fig. 484 by 
inspection. (The problem is also called the Chinese 
postman problem since it was published in the journal 
Chinese Mathematics | (1962), 273-277.) 


23.3 Bellman’s Principle. 


Fig. 484. 


Problem 13 


14. Show that the length of a shortest postman trail is the 
same for every starting vertex. 


15-17| EULER GRAPHS 


15. An Euler graph G is a graph that has a closed Euler 
trail. An Euler trail is a trail that contains every edge 
of G exactly once. Which subgraph with four edges of 
the graph in Example 1, Sec. 23.1, is an Euler graph? 


16. Find four different closed Euler trails in Fig. 485. 


2 4 


1 3 5 
Fig. 485. Problem 16 


17. Is the graph in Fig. 484 an Euler graph. Give reason. 


ORDER 


18. Show that O(m?) + O(m?) = O(m?) and kO(m”) = 
O(m?),. 

19. Show that V1 + m2 = O(m),0.02e™ + 100m? = 
Ove”). 

20. If we switch from one computer to another that is 100 
times as fast, what is our gain in problem size per hour 
in the use of an algorithm that is O(m), O(m 2) O(m?), 
O(e”)? 


Dijkstra’s Algorithm 


We continue our discussion of the shortest path problem in a graph G. The last section 
concerned the special case that all edges had length |. But in most applications the edges 
(i, j) will have any lengths /;; > 0, and we now turn to this general case, which is of 
greater practical importance. We write /;; = % for any edge (i, j) that does not exist in G 
(setting © + a = ~ for any number a, as usual). 

We consider the problem of finding shortest paths from a given vertex, denoted by | 
and called the origin, to all other vertices 2, 3,---, n of G. We let L; denote the length 


of a shortest path PF: | >; in G. 
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THEOREM -1 


PROOF 


Bellman’s Minimality Principle or Optimality Principle® 


If P,;; 1 jis a shortest path from | to j in G and (i, j) is the last edge of P; (Fig. 486), 
then P,; | — i [obtained by dropping (i, j) from P,] is a shortest path | >i. 


Fig. 486. Paths P and P; in Bellman’s minimality principle 


Suppose that the conclusion is false. Then there is a path P}: 1 — i that is shorter than 
P;. Hence, if we now add (i, j) to P#, we get a path 1 —j that is shorter than P,. This 
contradicts our assumption that P; is shortest. a 


From Bellman’s principle we can derive basic equations as follows. For fixed 7 we may 
obtain various paths 1 —j by taking shortest paths P, for various i for which there is in 
G an edge (i, j), and add (i, 7) to the corresponding P;. These paths obviously have lengths 
L; + 1 (LZ; = length of P;). We can now take the minimum over i, that is, pick an i for 
which L; + J;; is smallest. By the Bellman principle, this gives a shortest path 1 ~j. It 
has the length 


Io = 0 
1 Se say 
(1) L; = min (L; + Ly), : 
iF] 


These are the Bellman equations. Since /;; = 0 by definition, instead of min;,; we can 
simply write min;. These equations suggest the idea of one of the best-known algorithms 
for the shortest path problem, as follows. 


Dijkstra’s Algorithm for Shortest Paths 


Dijkstra’s* algorithm is shown in Table 23.2, where a connected graph G is a graph in 

which, for any two vertices v and w in G, there is a path v ~w. The algorithm is a 

labeling procedure. At each stage of the computation, each vertex v gets a label, either 
(PL) a permanent label = length L, of a shortest path 1 —v 


or 


(TL) a temporary label = upper bound L,, for the length of a shortest path 1 > v. 


3RICHARD BELLMAN ( 1920-1984), American mathematician, known for his work in dynamic programming. 
4EDSGER WYBE DIJKSTRA (1930-2002), Dutch computer scientist, 1972 recipient of the ACM Turing 
Award. His algorithm appeared in Numerische Mathematik 1 (1959), 269-271. 
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We denote by P¥ and J £& the sets of vertices with a permanent label and with a temporary 
label, respectively. The algorithm has an initial step in which vertex | gets the permanent 
label L, = O and the other vertices get temporary labels, and then the algorithm alternates 
between Steps 2 and 3. In Step 2 the idea is to pick k “minimally.” In Step 3 the idea is 
that the upper bounds will in general improve (decrease) and must be updated accordingly. 
Namely, the new temporary label L; of vertex j will be the old one if there is no 
improvement or it will be L;, + 1); if there is. 


Table 23.2 Dijkstra’s Algorithm for Shortest Paths 


ALGORITHM DIJKSTRA [G = (V, £), V = {1,°°-, }, Jj; for all (i, j) in E] 


Given a connected graph G = (V, E) with vertices 1, ---, nm and edges (i, j) having 
lengths /;; > 0, this algorithm determines the lengths of shortest paths from vertex | to 
the vertices 2,---,n. 


INPUT: Number of vertices n, edges (i, j), and lengths J/;, 
OUTPUT: Lengths L; of shortest paths 1 > j,j = 2,°--+,n 
1. Initial step 


Vertex 1 gets PL: L, = 0. a 
Vertex j (= 2,--+,n) gets TL: L; = 11; (= © if there is no edge (1, j) in G). 
Set PL = {1}, TL = {2,3,--+,n}. 


2. Fixing a permanent label 


Find a k in J£& for which Ly is miminum, set L; = Ly. Take the smallest k if 
there are several. Delete k from J and include it in PL. 
If TL = © (that is, TF is empty) then 


OUTPUT Ly, ---, L,. Stop 
Else continue (that is, go to Step 3). 
3. Updating temporary labels 


For all j in TY, set L; = min, {Lis L,, + 1,3} (that is, take the smaller of L; and 
L,, + Ui, as your new L,). 


Go to Step 2. 
End DIJKSTRA 


Application of Dijkstra’s Algorithm 
Applying Dijkstra’s algorithm to the graph in Fig. 487a, find shortest paths from vertex 1 to vertices 2, 3, 4. 
Solution. We list the steps and computations. 
1. “2 0, Ts =] 8 l= 5, = 7, PL = {1}, TL = {2, 3,4} 
2. Ls = min {Lz, Ls, L4} = 5, k = 3, PL = {1,3}, TL = {2,4} 
3. Ly = min {8, Ls + [39} = min {8,5 + 1} =6 
La = min {7, Ls + I34} = min {7, ©} = 7 
2. Ly = min {Lo, L4} = min {6,7} = 6, k = 2, PL = {1, 2, 3}, TL = {4} 
3. La = min {7, Ly + loa} = min {7,6 + 2} =7 
2. La=7,k =4 PL = {1, 2,3, 4}, TL = ©. 
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Figure 487b shows the resulting shortest paths, of lengths Ly = 6, L3 = 5, L4 = 7. i] 


(a) Given graph G (b) Shortest paths in G 
Fig. 487. Example 1 


Complexity. Dijkstra’s algorithm is O(n”). 
PROOF Step 2 requires comparison of elements, first n — 2, the next time n — 3, etc., a total 
of (n — 2)(n — 1)/2. Step 3 requires the same number of comparisons, a total of 


(n — 2)(n — 1)/2, as well as additions, first n — 2, the next time n — 3, etc., again a total of 
(n — 2)(n — )/2. Hence the total number of operations is 3(n — 2)(n — 1)/2 = O(n”). I 


PROBLEM SET 23-3 


1. The net of roads in Fig. 488 connecting four villages 5. 
is to be reduced to minimum length, but so that one 
can still reach every village from every other village. 
Which of the roads should be retained? Find the 
solution (a) by inspection, (b) by Dijkstra’s algorithm. 


Fig. 488. Problem 1 


2. Show that in Dijkstra’s algorithm, for L;, there is a path 
P:1—k of length Lr. 7. 

3. Show that in Dijkstra’s algorithm, at each instant the 
demand on storage is light (data for fewer than n edges). 


4-9| DIJKSTRA’S ALGORITHM 


For each graph find the shortest paths. 
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23.4 Shortest Spanning Trees: Greedy Algorithm 


So far we have discussed shortest path problems. We now turn to a particularly important 
kind of graph, called a tree, along with related optimization problems that arise quite 
often in practice. 

By definition, a tree T is a graph that is connected and has no cycles. “Connected” 
was defined in Sec. 23.3; it means that there is a path from any vertex in T to any other 
vertex in T. A cycle is a path s —t of at least three edges that is closed (t = s); see also 
Sec. 23.2. Figure 489a shows an example. 


CAUTION! The terminology varies; cycles are sometimes also called circuits. 


A spanning tree 7 in a given connected graph G = (V, £) is a tree containing all the 
n vertices of G. See Fig. 489b. Such a tree has n — 1 edges. (Proof?) 

A shortest spanning tree 7 in a connected graph G (whose edges (i, j) have lengths 
[,; > 0) is a spanning tree for which a1 (sum over all edges of 7) is minimum compared 
to X/,; for any other spanning tree in G. 


(a) (b) 
Fig. 489. Example of (a) a cycle, (b) a spanning tree in a graph 


Trees are among the most important types of graphs, and they occur in various 
applications. Familiar examples are family trees and organization charts. Trees can be 
used to exhibit, organize, or analyze electrical networks, producer—consumer and other 
business relations, information in database systems, syntactic structure of computer 
programs, etc. We mention a few specific applications that need no lengthy additional 
explanations. 

The set of shortest paths from vertex | to the vertices 2,---, 7 in the last section forms 
a spanning tree. 

Railway lines connecting a number of cities (the vertices) can be set up in the form of 
a spanning tree, the “length” of a line (edge) being the construction cost, and one wants 
to minimize the total construction cost. Similarly for bus lines, where “length” may be 
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EXAMPLE 1 


the average annual operating cost. Or for steamship lines (freight lines), where “length” 
may be profit and the goal is the maximization of total profit. Or in a network of telephone 
lines between some cities, a shortest spanning tree may simply represent a selection of 
lines that connect all the cities at minimal cost. In addition to these examples we could 
mention others from distribution networks, and so on. 

We shall now discuss a simple algorithm for the problem of finding a shortest spanning 
tree. This algorithm (Table 23.3) is particularly suitable for sparse graphs (graphs with 
very few edges; see Sec. 23.1). 


Table 23.3. Kruskal’s® Greedy Algorithm for Shortest Spanning Trees 
Proceedings of the American Mathematical Society 7 (1956), 48-50. 


ALGORITHM KRUSKAL [G = (V, E), [,; for all G, j) in E] 
Given a connected graph G = (V, E) with vertices 1, 2, +--+, n and edges (i, j) having 
length /,; > 0, the algorithm determines a shortest spanning tree T in G. 

INPUT: Edges (i, j) of G and their lengths /;; 

OUTPUT: Shortest spanning tree T in G 


1. Order the edges of G in ascending order of length. 
2. Choose them in this order as edges of T, rejecting an edge only if it forms a 
cycle with edges already chosen. 


If n — 1 edges have been chosen, then 
OUTPUT T (= the set of edges chosen). Stop 


End KRUSKAL 


Application of Kruskal’s Algorithm 


Using Kruskal’s algorithm, we shall determine a shortest spanning tree in the graph in Fig. 490. 


Table 23.4 Solution in Example 1 


Edge Length Choice 
(3, 6) 1 Ist 
(1, 2) 2 2nd 
(1, 3) 4 3rd 
Fig. 490. Graph in Example 1 @,9) 6 sis 
(2, 3) 7 Reject 
(3, 4) 8 5th 
(5, 6) 9 
(2, 4) 11 


Solution. See Table 23.4. In some of the intermediate stages the edges chosen form a disconnected graph 
(see Fig. 491); this is typical. We stop after n — 1 = 5 choices since a spanning tree has n — | edges. In our 
problem the edges chosen are in the upper part of the list. This is typical of problems of any size; in general, 
edges farther down in the list have a smaller chance of being chosen. 5] 


®5JOSEPH BERNARD KRUSKAL (1928-— ), American mathematician who worked at Bell Laboratories. 
He is known for his contributions to graph theory and statistics. 
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The efficiency of Kruskal’s method is greatly increased by double labeling of 
vertices. 


Double Labeling of Vertices. Each vertex i carries a double label (rj, p;), where 


1, = Root of the subtree to which i belongs, 
Pi = Predecessor of i in its subtree, 


pi = 0 for roots. 
This simplifies rejecting. 


Rejecting. If (i, j) is next in the list to be considered, reject (i, j) if r; = r; (that is, i and 
j are in the same subtree, so that they are already joined by edges and (i, 7) would thus 
create a cycle). If r; # rj, include (i, j) in T. 

If there are several choices for r;, choose the smallest. If subtrees merge (become a 
single tree), retain the smallest root as the root of the new subtree. 


For Example | the double-label list is shown in Table 23.5. In storing it, at each instant 
one may retain only the latest double label. We show all double labels in order to exhibit 
the process in all its stages. Labels that remain unchanged are not listed again. 
Underscored are the two 1|’s that are the common root of vertices 2 and 3, the reason for 
rejecting the edge (2, 3). By reading for each vertex the latest label we can read from 
this list that 1 is the vertex we have chosen as a root and the tree is as shown in the last 
part of Fig. 491. 


S \ - 2 


First Second Third Fourth Fifth 


Fig. 491. Choice process in Example 1 


Table 23.5 List of Double Labels in Example 1 


Choice 1 Choice 2 Choice 3 Choice 4 Choice 5 

Vertex (3, 6) (Gl, 2) Gl, 3) 4, 5) (3, 4) 

1 d, 0) 

2 d, 1) 

3 (3, 0) (1, 1) 

4 (4, 0) (1, 3) 

) (4, 4) (1, 4) 

6 (3, 3) (1, 3) 
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This is made possible by the predecessor label that each vertex carries. Also, for accepting 
or rejecting an edge we have to make only one comparison (the roots of the two endpoints 
of the edge). 

Ordering is the more expensive part of the algorithm. It is a standard process in 
data processing for which various methods have been suggested (see Sorting in Ref. 
[E25] listed in App. 1). For a complete list of m edges, an algorithm would be 
O(m logs m), but since the n — | edges of the tree are most likely to be found earlier, 
by inspecting the g (< m) topmost edges, for such a list of g edges one would have 


O(q loge m). 


PROBLEM SET 23-4 


1-6| KRUSKAL’S GREEDY ALGORITHM 5. 


Find a shortest spanning tree by Kruskal’s algorithm. 
Sketch it. 


7. CAS PROBLEM. Kruskal’s Algorithm. Write a 
corresponding program. (Sorting is discussed in Ref. 
[E25] listed in App. 1.) 


8. To get a minimum spanning tree, instead of adding 
shortest edges, one could think of deleting longest 
edges. For what graphs would this be feasible? 
Describe an algorithm for this. 


9. Apply the method suggested in Prob. 8 to the graph in 
Example 1. Do you get the same tree? 


10. Design an algorithm for obtaining longest spanning 
trees. 


11. Apply the algorithm in Prob. 10 to the graph in 
Example 1. Compare with the result in Example 1. 


12. Forest. A (not necessarily connected) graph without 
cycles is called a forest. Give typical examples of 
applications in which graphs occur that are forests or 
trees. 
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Dallas Denver Los Angeles New York Washington, DC 
Chicago 800 900 1800 700 650 
Dallas 650 1300 1350 1200 
Denver 850 1650 1500 
Los Angeles 2500 2350 
New York 200 
13. Air cargo. Find a shortest spanning tree in the 16. If a graph has no cycles, it must have at least 2 vertices 


complete graph of all possible 15 connections between 
the six cities given (distances by airplane, in miles, 
rounded). Can you think of a practical application of 
the result? 


14-20; GENERAL PROPERTIES OF TREES 
Prove the following. Hint. Use Prob. 14 in proving 15 and 


18; 
14. 


15. 


use Probs. 16 and 18 in proving 20. 


Uniqueness. The path connecting any two vertices u 
and vu in a tree is unique. 


If in a graph any two vertices are connected by a unique 
path, the graph is a tree. 


17. 


18. 


19. 


20. 


of degree | (definition in Sec. 23.1). 


A tree with exactly two vertices of degree 1 must be a 
path. 


A tree with n vertices has n — 1 edges. (Proof by 
induction.) 


If two vertices in a tree are joined by a new edge, a 
cycle is formed. 


A graph with n vertices is a tree if and only if it has 
n — | edges and has no cycles. 


23.5 Shortest Spanning Trees: 


Prim’s Algorithm 


Prim’s® algorithm, shown in Table 23.6, is another popular algorithm for the shortest 
spanning tree problem (see Sec. 23.4). This algorithm avoids ordering edges and gives a 
tree T at each stage, a property that Kruskal’s algorithm in the last section did not have 
(look back at Fig. 491 if you did not notice it). 

In Prim’s algorithm, starting from any single vertex, which we call 1, we “grow” the 
tree T by adding edges to it, one at a time, according to some rule (in Table 23.6) until 
T finally becomes a spanning tree, which is shortest. 

We denote by U the set of vertices of the growing tree T and by S the set of its edges. 
Thus, initially U = {1} and S = ©; at the end, U = V, the vertex set of the given graph 
G = (V, E), whose edges (i, j) have length /;; > 0, as before. 


SROBERT CLAY PRIM (1921- ), American computer scientist at General Electric, Bell Laboratories, and 


Sandia National Laboratories. 
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Thus at the beginning (Step 1) the labels 
yo,°*', An of the vertices 2,:°°,n 


are the lengths of the edges connecting them to vertex | (or ~ if there is no such edge in 
G). And we pick (Step 2) the shortest of these as the first edge of the growing tree T and 
include its other end j in U (choosing the smallest j if there are several, to make the process 
unique). Updating labels in Step 3 (at this stage and at any later stage) concerns each 
vertex k not yet in U. Vertex k has label Ay, = liq), from before. If Jj, < Ax, this means 
that k is closer to the new member j just included in U than k is to its old “closest neighbor” 
i(k) in U. Then we update the label of k, replacing Ax = ligg,x by Ax = Ljx and setting 
i(k) = j. If, however, lj, 2 Aj, (the old label of k), we don’t touch the old label. Thus the 
label A;, always identifies the closest neighbor of k in U, and this is updated in Step 3 as 
U and the tree T grow. From the final labels we can backtrack the final tree, and from their 
numeric values we compute the total length (sum of the lengths of the edges) of this tree. 

Prim’s algorithm is useful for computer network design, cable, distribution networks, 
and transportation networks. 


Table 23.6 Prim’s Algorithm for Shortest Spanning Trees 


Bell System Technical Journal 36 (1957), 1389-1401. 
For an improved version of the algorithm, see Cheriton and Tarjan, SIAM Journal on Computation 5 
(1976), 724-742. 


ALGORITHM PRIM [G = (V, E), V = {1, +++, n}, 4; for all (i, j) in E] 

Given a connected graph G = (V, E) with vertices 1, 2, --- , and edges (i, j) having 
length /,; > 0, this algorithm determines a shortest spanning tree T in G and its length 
LT). 


INPUT: n, edges (i, 7) of G and their lengths /;, 
OUTPUT: Edge set S of a shortest spanning tree T in G; L(T) 
Initially, all vertices are unlabeled. | 


1. Initial step 
Set i(k) = 1, U = {1}, S = @. 
Label vertex k (= 2,---, mn) with A; = 1; [= © if G has no edge (1, k)]. 


2. Addition of an edge to the tree T 
Let A; be the smallest A; for vertex k not in U. Include vertex j in U and edge 
(i(j), j) in S. 
If U = V then compute 
L(T) = =1;; (sum over all edges in S) 
OUTPUT S, L(T). Stop 
[S is the edge set of a shortest spanning tree T in G.] 
Else continue (that is, go to Step 3). 


3. Label updating 
For every k not in U, if Jj, < Aj, then set Ay, = 1}; and i(k) = j. 
Go to Step 2. 


End PRIM 
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EXAMPLE 1 Application of Prim’s Algorithm 


so that we can compare). 


Fig. 492. Graph in 
Example 1 


PPR wy WN wD 


Solution. The steps are as follows. 
1. i(k) = 1, U = {1}, S = ©, initial labels see Table 23.7. 
» Ag = ly = 2 is smallest, U = {1,2}, S = {C, 2)}. 


Find a shortest spanning tree in the graph in Fig. 492 (which is the same as in Example 1, Sec. 23.4, 


Update labels as shown in Table 23.7, column (1). 

Ag = I13 = 4 is smallest, U = {1, 2, 3}, S = {(, 2), C, 3)}. 

Update labels as shown in Table 23.7, column (ID). 

Ag = /gg = 1 is smallest, U = {1, 2, 3, 6}, S = {(1, 2), C1, 3), (3, 6)}. 

Update labels as shown in Table 23.7, column (III). 

Aq = Igq = 8 is smallest, U = {1, 2, 3, 4, 6}, S = {C1 2), (1, 3), (3, 4), (3, 6)}. 
Update labels as shown in Table 23.7, column (IV). 

. As = l45 = 6 is smallest, U = V, S = (1, 2), (1, 3), (3, 4), (3, 6), (4, 5). Stop. 


The tree is the same as in Example 1, Sec. 23.4. Its length is 21. You will find it interesting to 
compare the growth process of the present tree with that in Sec. 23.4. a 


Table 23.7 Labeling of Vertices in Example 1 


inttiell Relabeling 
Vertex 
Label () (II) (III) (IV) 
2 ly. = 2 —_— — — — 
3 lig =4 lig =4 = 
4 00 Ing = 11 Izq = 8 Izq = 8 
5 oo o o l65 = 9 lg5 = 6 
6 oe) ioe) Isg = 1 — 
PR-OBLEEM—SET 23-5 
SHORTEST SPANNING TREES. PRIM’S 6-13 | Finda shortest spanning tree by Prim’s algorithm. 
ALGORITHM 6. 
1. When will S = E at the end in Prim’s algorithm? 
2. Complexity. Show that Prim’s algorithm has com- 
plexity O(n”). 
3. What is the result of applying Prim’s algorithm to a 
graph that is not connected? 
7. 


. If for a complete graph (or one with very few edges 


missing), our datais ann X n distance table (as in Prob. 
13, Sec. 23.4), show that the present algorithm [which 
is O(n?)] cannot easily be replaced by an algorithm of 
order less than O(n”). 


. How does Prim’s algorithm prevent the generation of 


cycles as you grow T? 
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10. 
11. 
12. 
13. 


14. 


For the graph in Prob. 6, Sec. 23.4. 


For the graph in Prob. 4, Sec. 23.4. 
For the graph in Prob. 2, Sec. 23.4. 


CAS PROBLEM. Prim’s Algorithm. Write a program 
and apply it to Probs. 6-9. 


TEAM PROJECT. Center of a Graph and Related 
Concepts. (a) Distance, Eccentricity. Call the length 
of a shortest path u—v in a graph G = (V, E) the 


23.6 Flows in Networks 


After shortest path problems and problems for trees, as a third large area in combinatorial 
optimization we discuss flow problems in networks (electrical, water, communication, 
traffic, business connections, etc.), turning from graphs to digraphs (directed graphs; see 


Sec. 23.1). 
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distance d(u,v) from u to v. For fixed u, call the 
greatest d(u, UV) as v ranges over V the eccentricity €(u) 
of u. Find the eccentricity of vertices 1, 2, 3 in the 
graph in Prob. 7. 


(b) Diameter, Radius, Center. The diameter d(G) 
of a graph G = (V, E) is the maximum of d(u, v) as u 
and v vary over V, and the radius r(G) is the smallest 
eccentricity e(v) of the vertices v. A vertex v with 
e(v) = r(G) is called a central vertex. The set of all 
central vertices is called the center of G. Find 
d(G), r(G), and the center of the graph in Prob. 7. 


(c) What are the diameter, radius, and center of the 
spanning tree in Example | of the text? 


(d) Explain how the idea of a center can be used in setting 
up an emergency service facility on a transportation 
network. In setting up a fire station, a shopping center. 
How would you generalize the concepts in the case of two 
or more such facilities? 


(e) Show that a tree T whose edges all have length 1 
has center consisting of either one vertex or two 
adjacent vertices. 


(f) Set up an algorithm of complexity O(n) for finding 
the center of a tree T. 


By definition, a network is a digraph G = (V, E) in which each edge (i, j) has assigned 
to it a capacity c;; > 0 [= maximum possible flow along (i, j)], and at one vertex, s, 
called the source, a flow is produced that flows along the edges of the digraph G to another 
vertex, t, called the target or sink, where the flow disappears. 


In applications, this may be the flow of electricity in wires, of water in pipes, of cars 
on roads, of people in a public transportation system, of goods from a producer to 
consumers, of e-mail from senders to recipients over the Internet, and so on. 

We denote the flow along a (directed!) edge (i, j) by f,; and impose two conditions: 


1. For each edge (i, j) in G the flow does not exceed the capacity cj;, 


(1) 0 = fij = Cy 
2. For each vertex i, not s or ft, 


Inflow = Outflow 


(“Edge condition’’). 


(“Vertex condition,” “Kirchhoff’s law’’); 
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in a formula, 


Oif vertexi # s,i # ft, 


(2) > fei — > fj = \—fat the source s, 
: 
Inflow —— fat the target (sink) t, 


where f is the total flow (and at s the inflow is zero, whereas at t the outflow is zero). 
Figure 493 illustrates the notation (for some hypothetical figures). 


Fig. 493. Notation in (2): inflow and outflow for a vertex j (not s or t) 


Paths 


By a path v; — vu, from a vertex vy to a vertex Uz, in a digraph G we mean a sequence 
of edges 


(Vy, V2), (V2, U3); uy (Ux-1, Uk)» 


regardless of their directions in G, that forms a path as in a graph (see Sec. 23.2). Hence 
when we travel along this path from v, to Uv; we may traverse some edge in its given 
direction—then we call it a forward edge of our path—or opposite to its given direction— 
then we call it a backward edge of our path. In other words, our path consists of one-way 
streets, and forward edges (backward edges) are those that we travel in the right direction 
(in the wrong direction). Figure 494 shows a forward edge (u, v) and a backward edge (w, v) 
of a path v1 > Ux. 


CAUTION! Each edge in a network has a given direction, which we cannot change. 
Accordingly, if (u, v) is a forward edge in a path v; vx, then (u, v) can become a 
backward edge only in another path x; — x, in which it is an edge and is traversed in the 
opposite direction as one goes from x1 to x;; see Fig. 495. Keep this in mind, to avoid 
misunderstandings. 


ASA eo u —— 
v 7 : 
1 Ne x@ ee -—_@U, 


Uv, 1 : v 
Fig. 494. Forward edge (u, v) and Fig. 495. Edge (u, v) as forward edge in the path 
backward edge (w, v) of a path v, > v, Vv, > v, and as backward edge in the path x, — x, 


Flow Augmenting Paths 


Our goal will be to maximize the flow from the source s to the target t of a given network. 
We shall do this by developing methods for increasing an existing flow (including the 
special case in which the latter is zero). The idea then is to find a path P: s— rt all of 
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whose edges are not fully used, so that we can push additional flow through P. This 
suggests the following concept. 


DEFINITION Flow Augmenting Path 


A flow augmenting path in a network with a given flow f;; on each edge (i, j) is a 
path P: st such that 


(i) no forward edge is used to capacity; thus fj; < c;j; for these; 
(ii) no backward edge has flow 0; thus f;; > 0 for these. 


EXAMPLE 1_ Flow Augmenting Paths 


Find flow augmenting paths in the network in Fig. 496, where the first number is the capacity and the second 
number a given flow. 


7,4 
Fig. 496. Network in Example 1 
First number = Capacity, Second number = Given flow 


Solution. In practical problems, networks are large and one needs a systematic method for augmenting 
flows, which we discuss in the next section. In our small network, which should help to illustrate and clarify 
the concepts and ideas, we can find flow augmenting paths by inspection and augment the existing flow f = 9 
in Fig. 496. (The outflow from s is 5 + 4 = 9, which equals the inflow 6 + 3 into t.) 

We use the notation 


Aye = Gy — fy for forward edges 
Ai = fij for backward edges 
A = min Aij taken over all edges of a path. 


From Fig. 496 we see that a flow augmenting path Pj:s—t is Py: 1 —2—3-6 (Fig. 497), with 
Ais = 20 — 5 = 15, etc., and A = 3. Hence we can use P, to increase the given flow 9 to f=9 + 3 = 12. 
All three edges of P, are forward edges. We augment the flow by 3. Then the flow in each of the edges of P, 
is increased by 3, so that we now have fiz = 8 (instead of 5), fo3 = 11 (instead of 8), and fgg = 9 (instead of 
6). Edge (2, 3) is now used to capacity. The flow in the other edges remains as before. 

We shall now try to increase the flow in this network in Fig. 496 beyond f = 12. 

There is another flow augmenting path Py: st, namely, Pz: 1 — 4 — 5 — 3 — 6 (Fig. 497). It shows how 
a backward edge comes in and how it is handled. Edge (3, 5) is a backward edge. It has flow 2, so that Agg = 2. 
We compute Ay4 = 10 — 4 = 6, etc. (Fig. 497) and A = 2. Hence we can use P, for another augmentation to 
get f = 12 + 2 = 14. The new flow is shown in Fig. 498. No further augmentation is possible. We shall confirm 
later that f = 14 is maximum. 


45 As B) A 
dye 36 =7 Path P, 


Fig. 497. Flow augmenting paths in Example 1 
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Cut Sets 


A cut set is a set of edges in a network. The underlying idea is simple and natural. If we 
want to find out what is flowing from s to ¢ in a network, we may cut the network 
somewhere between s and ¢ (Fig. 498 shows an example) and see what is flowing in the 
edges hit by the cut, because any flow from s to t must sometimes pass through some of 
these edges. These form what is called a cut set. [In Fig. 498, the cut set consists of the 
edges (2, 3), (5, 2), (4, 5).] We denote this cut set by (S, 7). Here S is the set of vertices 
on that side of the cut on which s lies ($ = {s, 2,4} for the cut in Fig. 498) and 7 is the 
set of the other vertices (T = {3, 5, t} in Fig. 498). We say that a cut partitions the vertex 
set V into two parts S and T. Obviously, the corresponding cut set (S, T) consists of all 
the edges in the network with one end in S and the other end in T. 


Fig. 498. Maximum flow in Example 1 


By definition, the capacity cap (S, T) of a cut set (S, 7) is the sum of the capacities of 
all forward edges in (S, T) (forward edges only!), that is, the edges that are directed from 
S to T, 


(3) cap (S, T) = Dey [sum over the forward edges of (S, T)]. 
Thus, cap (S, 7) = 11 + 7 = 18 in Fig. 498. 


Explanation. This can be seen as follows. Look at Fig. 498. Recall that for each edge 
in that figure, the first number denotes capacity and the second number flow. Intuitively, 
you can think of the edges as roads, where the capacity of the road is how many cars can 
actually be on the road, and the flow denotes how many cars actually are on the road. To 
compute capacity cap (S$, JT) we are only looking at the first number on the edges. Take 
a look and see that the cut physically cuts three edges, that is, (2, 3), (4, 5), and (5, 2). 
The cut concerns only forward edges that are being cut, so it concerns edges (2, 3) and 
(4, 5) (and does not include edge (5, 2) which is also being cut, but since it goes backwards, 
it does not count). Hence (2, 3) contributes 11 and (4, 5) contributes 7 to the capacity cap 
(S, T), for a total of 18 in Fig. 498. Hence cap (S$, T) = 18. 

The other edges (directed from T to S) are called backward edges of the cut set (S, T), 
and by the net flow through a cut set we mean the sum of the flows in the forward edges 
minus the sum of the flows in the backward edges of the cut set. 


CAUTION! Distinguish well between forward and backward edges in a cut set and in 
a path: (5, 2) in Fig. 498 is a backward edge for the cut shown but a forward edge in the 
pah1—-4-5-2-3-6. 


For the cut in Fig. 498 the net flow is 11 + 6 — 3 = 14. For the same cut in Fig. 496 
(not indicated there), the net flow is 8 + 4 — 3 = 9. In both cases it equals the flow f- 
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THEOREM 1 


PROOF 


THEOREM 2 


PROOF 


We claim that this is not just by chance, but cuts do serve the purpose for which we have 
introduced them: 


Net Flow in Cut Sets 


Any given flow in a network G is the net flow through any cut set (S, T) of G. 


By Kirchhoff’s law (2), multiplied by —1, at a vertex i we have 


ifi # s,t, 


0 
(4) Pe eee -{ 
j 1 f 


— 
Outflow Inflow 


ifi=s. 


Here we can sum over j and / from | to n (= number of vertices) by putting f;; = 0 for 
j = and also for edges without flow or nonexisting edges; hence we can write the two 
sums as one, 


ifi #s,t, 


ifi=s. 


0 
G3-i= { 
j f 
We now sum over all i in S. Since s is in S, this sum equals f: 


(5) VG =s 


iES jEV 


We claim that in this sum, only the edges belonging to the cut set contribute. Indeed, 
edges with both ends in T cannot contribute, since we sum only over i in S; but edges 
(i, 7) with both ends in S contribute +f;; at one end and —f;; at the other, a total contribution 
of 0. Hence the left side of (5) equals the net flow through the cut set. By (5), this is equal 
to the flow f and proves the theorem. i 


This theorem has the following consequence, which we shall also need later in this 
section. 


Upper Bound for Flows 
A flow f in a network G cannot exceed the capacity of any cut set (S, T) in G. 


By Theorem | the flow f equals the net flow through the cut set, f = f, — fo, where f, 
is the sum of the flows through the forward edges and f5 (= 0) is the sum of the flows 
through the backward edges of the cut set. Thus f = f,;. Now f, cannot exceed the sum 
of the capacities of the forward edges; but this sum equals the capacity of the cut set, by 
definition. Together, f = cap (S, T), as asserted. (| 
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THEOREM 3 


PROOF 


THEOREM 4 


PROOF 
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Cut sets will now bring out the full importance of augmenting paths: 


Main Theorem. Augmenting Path Theorem for Flows 


A flow from s to t in a network G is maximum if and only if there does not exist a 
flow augmenting path s >t in G. 


(a) If there is a flow augmenting path P: s — f, we can use it to push through it an additional 
flow. Hence the given flow cannot be maximum. 


(b) On the other hand, suppose that there is no flow augmenting path srt in G. 
Let So be the set of all vertices i (including s) such that there is a flow augmenting 
path s — i, and let 7 be the set of the other vertices in G. Consider any edge (i, j) with 
i in Sg and j in 7p. Then we have a flow augmenting path s— i since i is in So, but 
s5—i—j] is not flow augmenting because j is not in Sp. Hence we must have 


Cij forward 
(6) fy = if Gi, j) isa edge of the path s ij. 
0) backward 


Otherwise we could use (i, j) to get a flow augmenting path s—>i—j. Now (So, 7o) 
defines a cut set (since fis in 7g; why?). Since by (6), forward edges are used to capacity 
and backward edges carry no flow, the net flow through the cut set (So, 79) equals the 
sum of the capacities of the forward edges, which is cap (So, Jo) by definition. This 
net flow equals the given flow f by Theorem |. Thus f = cap (So, 79). We also have 
f = cap (So, Io) by Theorem 2. Hence f must be maximum since we have reached 
equality. o 


The end of this proof yields another basic result (by Ford and Fulkerson, Canadian Journal 
of Mathematics 8 (1956), 399-404), namely, the so-called 


Max-Flow Min-Cut Theorem 


The maximum flow in any network G equals the capacity of a “minimum cut set” 
(= a cut set of minimum capacity) in G. 


We have just seen that f = cap (So, 79) for a maximum flow fand a suitable cut set (So, 7). 
Now by Theorem 2 we also have f = cap (S, T) for this f and any cut set (S, T) in G. 
Together, cap (So, 7o) = cap (S, T). Hence (So, 7p) is a minimum cut set. 

The existence of a maximum flow in this theorem follows for rational capacities from 
the algorithm in the next section and for arbitrary capacities from the Edmonds—Karp BFS 
also in that section. |_| 


The two basic tools in connection with networks are flow augmenting paths and cut sets. 
In the next section we show how flow augmenting paths can be used in an algorithm for 
maximum flows. 
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PROBLEM SET 23-6 


1-6| CUT SETS, CAPACITY 
Find 7 and cap (S, T) for: 


1. Fig. 498, § = {1, 2, 4, 5} 


. Fig. 499, S = {1, 2, 3} 
. Fig. 498, S = {1, 2, 3} 


2 

3 

4. Fig. 499, S = {1,2} 

5. Fig. 499, § = {1,2,4,5} 
6 


. Fig. 498, S = {1, 3,5} 


Fig. 499. Problems 2, 4, and 5 


7-8| MINIMUM CUT SET 


Find a minimum cut set and its capacity for the network: 


7. In Fig. 499 16-19| MAXIMUM FLOW 


Find the maximum flow by inspection: 


8. In Fig. 496. Verify that its capacity equals the maximum 


flow. 16. In Prob. 13 


9. Why are backward edges not considered in the 
definition of the capacity of a cut set? 


10. Incremental network. Sketch the network in Fig. 499, 
and on each edge (i, j) write c,j — fi; and f;;. Do you 
recognize that from this “incremental network” one can 
more easily see flow augmenting paths? 


11. Omission of edges. Which edges could be omitted 
from the network in Fig. 499 without decreasing the 
maximum flow? 


12-15| FLOW AUGMENTING PATHS 


Find flow augmenting paths: 


12. 


20. Find another maximum flow f = 15 in Prob. 19. 
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23.7 Maximum Flow: Ford—Fulkerson Algorithm 


Flow augmenting paths, as discussed in the last section, are used as the basic tool in the 
Ford—Fulkerson’ algorithm in Table 23.8 in which a given flow (for instance, zero flow in 
all edges) is increased until it is maximum. The algorithm accomplishes the increase by a 
stepwise construction of flow augmenting paths, one at a time, until no further such paths 
can be constructed, which happens precisely when the flow is maximum. 

In Step 1, an initial flow may be given. In Step 3, a vertex j can be labeled if there is 
an edge (i, j) with i labeled and 


Gy fe (“forward edge’’) 
or if there is an edge (j, i) with i labeled and 
fii > 9 (“backward edge’’). 


To scan a labeled vertex i means to label every unlabeled vertex j adjacent to i that can be 
labeled. Before scanning a labeled vertex i, scan all the vertices that got labeled before i. 
This BFS (Breadth First Search) strategy was suggested by Edmonds and Karp in 1972 
(Journal of the Association for Computing Machinery 19, 248-64). It has the effect that one 
gets shortest possible augmenting paths. 


Table 23.8 Ford—Fulkerson Algorithm for Maximum Flow 
Canadian Journal of Mathematics 9 (1957), 210-218 


ALGORITHM FORD-FULKERSON 
[G = (V, E), vertices 1 (= s),---,n (= 0), edges (i, 7), cj] 


This algorithm computes the maximum flow in a network G with source s, sink f, and 
capacities c;; > O of the edges (i, /). 

INPUT: n, 5 = 1, t = n, edges (i, j) of G, cj 

OUTPUT: Maximum flow f in G 

1. Assign an initial flow f;; (for instance, f,; = 0 for all edges), compute f. 

2. Label s by @. Mark the other vertices “unlabeled.” 


3. Find a labeled vertex i that has not yet been scanned. Scan i as follows. For every 
unlabeled adjacent vertex j, if c;; > fj;, compute 


A 


ij ifi=1 
Aiy = Cij — fi and A; = 
and label j with a “forward label” a, j); or if fy, > 0, compute 
A; = min (Aj, fii) 


and label j by a “backward label” (i~, Aj). 


“LESTER RANDOLPH FORD Jr. (1927- ) and DELBERT RAY FULKERSON (1924-1976), American 
mathematicians known for their pioneering work on flow algorithms. 
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If no such j exists then OUTPUT f. Stop 
[f is the maximum flow.] 
Else continue (that is, go to Step 4). 
4. Repeat Step 3 until f is reached. 
[This gives a flow augmenting path P: s > t.] 
If it is impossible to reach ¢ then OUTPUT f. Stop 
Lf is the maximum flow.] 
Else continue (that is, go to Step 5). 
5. Backtrack the path P, using the labels. 
6. Using P, augment the existing flow by A,. Set f= f + Ay. 
7. Remove all labels from vertices 2,---, n. Go to Step 3. 
End FORD-FULKERSON 


EXAMPLE 1. Ford—Fulkerson Algorithm 


Applying the Ford—Fulkerson algorithm, determine the maximum flow for the network in Fig. 500 (which is 
the same as that in Example 1, Sec. 23.6, so that we can compare). 


Solution. The algorithm proceeds as follows. 
1. An initial flow f = 9 is given. 


2. Label s (= 1) by @. Mark 2, 3, 4, 5, 6 “unlabeled.” 


7,4 


Fig. 500. Network in Example 1 with capacities (first numbers) and given flow 


3. Scan 1. 
Compute Ayo = 20 — 5 = 15 = Ag. Label 2 by (1*, 15). 
Compute Ayqg = 10 — 4 = 6 = Ag. Label 4 by (1°, 6). 
4. Scan 2. 
Compute Agg = 11 — 8 = 3, Ag = min (Ag, 3) = 3. Label 3 by (2°, 3). 
Compute As = min (Ag, 3) = 3. Label 5 by (2°, 3). 
Scan 3. 
Compute Agg = 13 — 6 = 7, Ag = A; = min (Ag, 7) = 3. Label 6 by (3°, 3). 


P:1—2-—3— 6(= 12) is a flow augmenting path. 


6. A; = 3. Augmentation gives fig = 8, fog = 11, fg6 = 9, other fj; unchanged. Augmented flow 


f=9+3= 12. 
7. Remove labels on vertices 2,---, 6. Go to Step 3. 
3. Scan 1. 


Compute Ayy = 20 — 8 = 12 = Ag. Label 2 by (1°, 12). 
Compute Ayqg = 10 — 4 = 6 = Ag. Label 4 by (1*, 6). 
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4. Scan 2. 
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Compute As = min (Ag, 3) = 3. Label 5 by (27, 3). 


Scan 4. [No vertex left for labeling.] 


Scan 5. 


Compute A3 = min (As, 2) = 2. Label 3 by (5, 2). 


Scan 3. 


Compute Agg = 13 — 9 = 4, Ag = min (A3, 4) = 2. Label 6 by (3°, 2). 


5. P:1—2-—5-—3-—6( Pf) isa flow augmenting path. 


6. A; = 2. Augmentation gives fj2 = 10, fg2 = 1, f35 = 0, fs6 = 11, other fj; unchanged. Augmented 
flow f=12+2= 14. 
7. Remove labels on vertices 2,---, 6. Go to Step 3. 


One can now scan | and then scan 2, as before, but in scanning 4 and then 5 one finds that no vertex is left for 


labeling. Thus one can no longer reach ¢. Hence the flow obtained (Fig. 501) is maximum, in agreement with 


our result in the last section. 


Fig. 501. 


PROBLEM SET 23-7 


. Do the computations indicated near the end of Exam- 
ple 1 in detail. 

. Solve Example 1 by Ford—Fulkerson with initial flow 0. 
Is it more work than in Example 1? 

. Which are the “bottleneck” edges by which the flow in 
Example 1 is actually limited? Hence which capacities 
could be decreased without decreasing the maximum 
flow? 

. What is the (simple) reason that Kirchhoff’s law is 
preserved in augmenting a flow by the use of a flow 
augmenting path? 

. How does Ford—Fulkerson prevent the formation of 
cycles? 


6-9 


MAXIMUM FLOW 


Maximum flow in Example 1 


10 


11. 


12. 


13. 


14. 


15. 
16. 


Find the maximum flow by Ford-Fulkerson: 


6. In Prob. 12, Sec. 23.6 
7. In Prob. 15, Sec. 23.6 
8. In Prob. 14, Sec. 23.6 


17. 


18. 


. Integer flow theorem. Prove that, if the capacities in 
a network G are integers, then a maximum flow exists 
and is an integer. 


CAS PROBLEM. Ford-Fulkerson. Write a program 
and apply it to Probs. 6-9. 


How can you see that Ford—Fulkerson follows a BFS 
technique? 

Are the consecutive flow augmenting paths produced 
by Ford—Fulkerson unique? 


If the Ford—Fulkerson algorithm stops without reach- 
ing t, show that the edges with one end labeled and the 
other end unlabeled form a cut set (S, T) whose capacity 
equals the maximum flow. 


Find a minimum cut set in Fig. 500 and its capacity. 


Show that in a network G with all c;; = 1, the maximum 
flow equals the number of edge-disjoint paths s > tf. 


In Prob. 15, the cut set contains precisely all forward 
edges used to capacity by the maximum flow (Fig. 501). 
Is this just by chance? 


Show that in a network G with capacities all equal to 1, 
the capacity of a minimum cut set (S, T) equals the 
minimum number g of edges whose deletion destroys 
all directed paths st. (A directed path v > w is a 
path in which each edge has the direction in which it is 
traversed in going from vu to w.) 
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19. Several sources and sinks. If a network has several 
sources 51,°**, 5, show that it can be reduced to the 
case of a single-source network by introducing a new 
vertex s and connecting s to 51,°--, 5% by k edges of 
capacity ©. Similarly if there are several sinks. Illustrate 
this idea by a network with two sources and two sinks. 


20. Find the maximum flow in the network in Fig. 502 with Fig. 502. Problem 20 
two sources (factories) and two sinks (consumers). 


23.8 Bipartite Graphs. Assignment Problems 


DEFINITION 


From digraphs we return to graphs and discuss another important class of combinatorial 
optimization problems that arises in assignment problems of workers to jobs, jobs to 
machines, goods to storage, ships to piers, classes to classrooms, exams to time periods, 
and so on. To explain the problem, we need the following concepts. 

A bipartite graph G = (V, E) is a graph in which the vertex set V is partitioned into 
two sets S and T (without common elements, by the definition of a partition) such that 
every edge of G has one end in S and the other in 7. Hence there are no edges in G that 
have both ends in S or both ends in T. Such a graph G = (V,£) is also written 
G =(S,T; E). 

Figure 503 shows an illustration. V consists of seven elements, three workers a, b, c, 
making up the set S, and four jobs 1, 2, 3, 4, making up the set 7. The edges indicate that 
worker a can do the jobs 1 and 2, worker b the jobs 1, 2, 3, and worker c the job 4. The 
problem is to assign one job to each worker so that every worker gets one job to do. This 
suggests the next concept, as follows. 


Maximum Cardinality Matching 


A matching in G = (5S, T; E) is a set M of edges of G such that no two of them 
have a vertex in common. If M consists of the greatest possible number of edges, 
we call it a maximum cardinality matching in G. 


For instance, a matching in Fig. 503 is My = {(a, 2), (b, 1)}. Another is Mz = {(a, 1), 
(b, 3), (c, 4)}; obviously, this is of maximum cardinality. 


S E 
1 


c PT 
4 
Fig. 503. Bipartite graph in the assignment of a set S = {a, b, c} 
of workers to a set T = {I, 2, 3, 4} of jobs 


A vertex U is exposed (or not covered) by a matching M if v is not an endpoint of an 
edge of M. This concept, which always refers to some matching, will be of interest when 
we begin to augment given matchings (below). If a matching leaves no vertex exposed, 
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THEOREM 1 


PROOF 
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we call it a complete matching. Obviously, a complete matching can exist only if S and 
T consist of the same number of vertices. 

We now want to show how one can stepwise increase the cardinality of a matching M 
until it becomes maximum. Central in this task is the concept of an augmenting path. 


An alternating path is a path that consists alternately of edges in M and not in M 
(Fig. 504A). An augmenting path is an alternating path both of whose endpoints (a and b 
in Fig. 504B) are exposed. By dropping from the matching M the edges that are on an 
augmenting path P (two edges in Fig. 504B) and adding to M the other edges of P (three 
in the figure), we get a new matching, with one more edge than M. This is how we use 
an augmenting path in augmenting a given matching by one edge. We assert that this 
will always lead, after a number of steps, to a maximum cardinality matching. Indeed, 
the basic role of augmenting paths is expressed in the following theorem. 


i la 


(A) Alternating path 


eet 
a 
(B) Augmenting path P 


Fig. 504. Alternating and augmenting paths. 
Heavy edges are those belonging to a matching M 


Augmenting Path Theorem for Bipartite Matching 


A matching M in a bipartite graph G = (S, T; E) is of maximum cardinality if and 
only if there does not exist an augmenting path P with respect to M. 


(a) We show that if such a path P exists, then M is not of maximum cardinality. Let P have 
q edges belonging to M. Then P has g + | edges not belonging to M. (In Fig. 504B we 
have q = 2.) The endpoints a and b of P are exposed, and all the other vertices on P are 
endpoints of edges in M, by the definition of an alternating path. Hence if an edge of M is 
not an edge of P, it cannot have an endpoint on P since then M would not be a matching. 
Consequently, the edges of M not on P, together with the g + 1 edges of P not belonging 
to M form a matching of cardinality one more than the cardinality of M because we omitted 
q edges from M and added g + 1 instead. Hence M cannot be of maximum cardinality. 


(b) We now show that if there is no augmenting path for M, then M is of maximum 
cardinality. Let M* be a maximum cardinality matching and consider the graph H 
consisting of all edges that belong either to M or to M*, but not to both. Then it is possible 
that two edges of H have a vertex in common, but three edges cannot have a vertex in 
common since then two of the three would have to belong to M (or to M*), violating that 
M and M* are matchings. So every v in V can be in common with two edges of H or with 
one or none. Hence we can characterize each “component” (= maximal connected subset) 
of H as follows. 


(A) A component of H can be a closed path with an even number of edges (in the case 
of an odd number, two edges from M or two from M* would meet, violating the matching 
property). See (A) in Fig. 505. 
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(B) A component of H can be an open path P with the same number of edges from M 
and edges from M*, for the following reason. P must be alternating, that is, an edge of 
M is followed by an edge of M*, etc. (since M and M* are matchings). Now if P had an 
edge more from M*, then P would be augmenting for M [see (B2) in Fig. 505], 
contradicting our assumption that there is no augmenting path for M. If P had an edge 
more from M, it would be augmenting for M* [see (B3) in Fig. 505], violating the 
maximum cardinality of M*, by part (a) of this proof. Hence in each component of H, the 
two matchings have the same number of edges. Adding to this the number of edges that 
belong to both M and M* (which we left aside when we made up H), we conclude that 
M and M* must have the same number of edges. Since M* is of maximum cardinality, 
this shows that the same holds for M, as we wanted to prove. | 


o s 
o* SS === Edge from M 


ea ie a == == Edge from M* 


(BL) Qe Qe: eG === e———8_~_s (Possible) 
(B2) @=-=-- ©" --- ©" --=-8 (Augmenting for M) 


(B3) @=——@ -=--6@—— 6 ---»- 6 (Augmenting for M*) 
Fig. 505. Proof of the augmenting path theorem for bipartite matching 


This theorem suggests the algorithm in Table 23.9 for obtaining augmenting paths, in 
which vertices are labeled for the purpose of backtracking paths. Such a label is in 
addition to the number of the vertex, which is also retained. Clearly, to get an augmenting 
path, one must start from an exposed vertex, and then trace an alternating path until one 
atrives at another exposed vertex. After Step 3 all vertices in S are labeled. In Step 4, 
the set T contains at least one exposed vertex, since otherwise we would have stopped 
at Step 1. 


Table 23.9 Bipartite Maximum Cardinality Matching 


ALGORITHM MATCHING [G = (5S, T; E), M, n] 
This algorithm determines a maximum cardinality matching M in a bipartite graph G by 
augmenting a given matching in G. 
INPUT: Bipartite graph G = (S, 7; E) with vertices 1, -- - ,n, matching M in G (for 
instance, M = @) 
OUTPUT: Maximum cardinality matching M in G 
1. If there is no exposed vertex in S then 


OUTPUT M. Stop 
[M is of maximum cardinality in G.] 


Else label all exposed vertices in S with ©. 
2. For each i in S and edge (i, j) not in M, label j with i, unless already labeled. 
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EXAMPLE 1 
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3. For each nonexposed j in T, label i with j, where i is the other end 
of the unique edge (i, j) in M. 

4. Backtrack the alternating path P ending on an exposed vertex in T 
by using the labels on the vertices. 

5. If no P in Step 4 is augmenting then 
OUTPUT M. Stop 
[M is of maximum cardinality in G.] 

Else augment M by using an augmenting path P. 

Remove all labels. 


Go to Step 1. 
End MATCHING 


Maximum Cardinality Matching 


Is the matching M in Fig. 506a of maximum cardinality? If not, augment it until maximum cardinality is reached. 


(a) Given graph (b) Matching M, 
and matching M, and new labels 


Fig. 506. Example 1 


Solution. We apply the algorithm. 
1. Label 1 and 4 with ©. 
2. Label 7 with 1. Label 5, 6, 8 with 3. 
3. Label 2 with 6, and 3 with 7. 
[All vertices are now labeled as shown in Fig. 506a.] 
4. Py: 1 —7-—3—S. [By backtracking, P, is augmenting.] 
Po: 1 — 7 —3 — 8. [P» is augmenting.] 


5. Augment M, by using P;, dropping (3, 7) from M, and including (1, 7) and (3, 5). Remove all labels. 
Go to Step 1. 


Figure 506b shows the resulting matching Mz = {(1, 7), (2, 6), (3, 5)}. 
Label 4 with @. 

Label 7 with 2. Label 6 and 8 with 3. 

Label 1 with 7, and 2 with 6, and 3 with 5. 


P3:5 — 3 — 8. [Ps is alternating but not augmenting.] 


A a a 


Stop. Mg is of maximum cardinality (namely, 3). | 
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PROBLEM SET 23-8 


1-7| BIPARTITE OR NOT? 


If you answer is yes, find S and T: 


8. Can you obtain the answer to Prob. 3 from that to 


Prob. 1? 


9. Can you obtain a bipartite subgraph in Prob. 4 by 
omitting two edges? Any two edges? Any two edges 


without a common vertex? 


10-12} MATCHING. AUGMENTING PATHS 


Find an augmenting path: 
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13-15| MAXIMUM CARDINALITY MATCHING 


Using augmenting paths, find a maximum cardinality 
matching: 


13. 
14. 
15. 
16. 


17. 


18. 


In Prob. 11 

In Prob. 10 

In Prob. 12 

Complete bipartite graphs. A bipartite graph 
G = (S, T; E) is called complete if every vertex in S is 
joined to every vertex in T by an edge, and is denoted 
by Kno» Where n and ng are the numbers of vertices 
in S and T, respectively. How many edges does this 
graph have? 

Planar graph. A planar graph is a graph that can be 
drawn on a sheet of paper so that no two edges cross. 
Show that the complete graph K4 with four vertices is 
planar. The complete graph Ks with five vertices is not 
planar. Make this plausible by attempting to draw Ks 
so that no edges cross. Interpret the result in terms of 
a net of roads between five cities. 


Bipartite graph K3 3 not planar. Three factories 1, 
2, 3 are each supplied underground by water, gas, and 
electricity, from points A, B, C, respectively. Show that 
this can be represented by K3 3 (the complete bipartite 
graph G = (S, T; E) with S and T consisting of three 
vertices each) and that eight of the nine supply lines 
(edges) can be laid out without crossing. Make it 
plausible that K3 3 is not planar by attempting to draw 
the ninth line without crossing the others. 


19-25| VERTEX COLORING 


19. 


Vertex coloring and exam scheduling. What is the 
smallest number of exam periods for six subjects a, b, 
c, d, e, f if some of the students simultaneously take a, 
b, f, some c, d, e, some a, c, e, and some c, e? Solve 
this as follows. Sketch a graph with six vertices a,:--,f 
and join vertices if they represent subjects simul- 
taneously taken by some students. Color the vertices 
so that adjacent vertices receive different colors. (Use 
numbers 1, 2,--- instead of actual colors if you want.) 
What is the minimum number of colors you need? For 
any graph G, this minimum number is called the 
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20. 


(vertex) chromatic number X,(G). Why is this the 
answer to the problem? Write down a _ possible 
schedule. 


Scheduling and matching. Three teachers x1, xo, x3 
teach four classes y1, ye, v3, Ya for these numbers of 
periods: 


21. 


22. 


J1 v2 ¥3 ya 
X1 1 0 1 1 
X92 1 1 1 1 
x3 0 1 1 1 


Show that this arrangement can be represented by a 
bipartite graph G and that a teaching schedule for one 
period corresponds to a matching in G. Set up a 
teaching schedule with the smallest possible number of 
periods. 

How many colors do you need for vertex coloring any 
tree? 


Harbor management. How many piers does a harbor 
master need for accommodating six cruise ships 
Sy,°++, Sg with expected dates of arrival A and departure 
D in July, (A, D) = (10,13), (13,15), (14, 17), 
(12, 15), (16, 18), (14, 17), respectively, if each pier can 


23. 


24. 


25. 


26. 
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accommodate only one ship, arrival being at 6 am and 
departures at 11 pm? Hint. Join S; and S; by an edge if 
their intervals overlap. Then color vertices. 
What would be the answer to Prob. 22 if only the five 
ships $1,---, 55 had to be accommodated? 


Four- (vertex) color theorem. The famous four-color 
theorem states that one can color the vertices of any 
planar graph (so that adjacent vertices get different 
colors) with at most four colors. It had been conjectured 
for a long time and was eventually proved in 1976 by 
Appel and Haken [Illinois J. Math 21 (1977), 429-567]. 
Can you color the complete graph Ks with four colors? 
Does the result contradict the four-color theorem? (For 
more details, see Ref. [Fl] in App. 1.) 


Find a graph, as simple as possible, that cannot be 
vertex colored with three colors. Why is this of interest 
in connection with Prob. 24? 


Edge coloring. The edge chromatic number X,(G) of 
a graph G is the minimum number of colors needed for 
coloring the edges of G so that incident edges get 
different colors. Clearly, X.(G) 2 max d(u), where d(u) 
is the degree of vertex u. If G = (S, T; E) is bipartite, 
the equality sign holds. Prove this for Ky», the complete 
(cf. Sec. 23.1) bipartite graph G = (S, T, E) with S and 
T consisting of n vertices each. 


CHAPTER 23 REVIEW QUESTIONS AND PROBLEMS 


. What is a graph, a digraph, a cycle, a tree? 


. State some typical problems that can be modeled and 


solved by graphs or digraphs. 


. State from memory how graphs can be handled on 


computers. 


4. What is a shortest path problem? Give applications. 


10. 


. What situations can be handled in terms of the traveling 


salesman problem? 


. Give typical applications involving spanning trees. 
. What are the basic ideas and concepts in handling flows? 


. What is combinatorial optimization? Which sections of 


this chapter involved it? Explain details. 


. Define bipartite graphs and describe some typical 


applications of them. 


What is BFS? DFS? In what connection did these 
concepts occur? 


11-16 


MATRICES FOR GRAPHS AND DIGRAPHS 


Find the adjacency matrix of: 


11. 


OO 
ao 


12. 13. 
x 
CaO 
14-16} Sketch the graph whose adjacency matrix is: 
fo 1 1 1 0 1 0 1 
1 0 1 1 1 0 0 1 
14. 15. 
1 1 0 1 0 0 0 1 
1 1 1 0 1 1 1 0 


16. 


17. 


0 0 
1 0 0 1 


1 1 1 0 


Vertex incidence list. Make it for the graph in Prob. 15. 
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18. Find a shortest path and its length by Moore’s BFS 22. Find flow augmenting paths and the maximum flow. 
algorithm, assuming that all the edges have length 1. 


Problem 18 Problem 22 


19. Find shortest paths by Dijkstra’s algorithm. 23. Using augmenting paths, find a maximum cardinality 


matching. 
a a 
GO —® 
Problem 19 6) 6) 
20. Find a shortest spanning tree. 
@ (8) 
Problem 25 


24. Find an augmenting path, 


Problem 20 


21. Company A has offices in Chicago, Los Angeles, and 
New York; Company B in Boston and New York; 
Company C in Chicago, Dallas, and Los Angeles. 
Represent this by a bipartite graph. Problem 24 


SUMMARY-OF-CHAPTER-23 


Graphs. Combinatorial Optimization 


Combinatorial optimization concerns optimization problems of a discrete or 
combinatorial structure. It uses graphs and digraphs (Sec. 23.1) as basic tools. 

A graph G = (V, E) consists of a set V of vertices U1, V2,°*+, Un (often simply 
denoted by 1, 2,---,) and a set E of edges e}, e9,°-+, em, each of which connects 
two vertices. We also write (i, j) for an edge with vertices i and j as endpoints. A 
digraph (= directed graph) is a graph in which each edge has a direction (indicated 
by an arrow). For handling graphs and digraphs in computers, one can use matrices 
or lists (Sec. 23.1). 

This chapter is devoted to important classes of optimization problems for graphs 
and digraphs that all arise from practical applications, and corresponding algorithms, 
as follows. 
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In a shortest path problem (Sec. 23.2) we determine a path of minimum length 
(consisting of edges) from a vertex s to a vertex f in a graph whose edges (i, j) have 
a “length” J;; > 0, which may be an actual length or a travel time or cost or an 
electrical resistance [if (i, 7) is a wire in a net], and so on. Dijkstra’s algorithm 
(Sec. 23.3) or, when all /;; = 1, Moore’s algorithm (Sec. 23.2) are suitable for these 
problems. 

A tree is a graph that is connected and has no cycles (no closed paths). Trees are 
very important in practice. A spanning tree in a graph G is a tree containing all the 
vertices of G. If the edges of G have lengths, we can determine a shortest spanning 
tree, for which the sum of the lengths of all its edges is minimum, by Kruskal’s 
algorithm or Prim’s algorithm (Secs. 23.4, 23.5). 

A network (Sec. 23.6) is a digraph in which each edge (i, j) has a capacity 
cj > O[= maximum possible flow along (i, j)] and at one vertex, the source s, a 
flow is produced that flows along the edges to a vertex t¢, the sink or target, where 
the flow disappears. The problem is to maximize the flow, for instance, by applying 
the Ford-Fulkerson algorithm (Sec. 23.7), which uses flow augmenting paths 
(Sec. 23.6). Another related concept is that of a cut set, as defined in Sec. 23.6. 

A bipartite graph G = (V, E) (Sec. 23.8) is a graph whose vertex set V consists 
of two parts S and T such that every edge of G has one end in S and the other in T, 
so that there are no edges connecting vertices in S or vertices in 7. A matching in 
G is a set of edges, no two of which have an endpoint in common. The problem 
then is to find a maximum cardinality matching in G, that is, a matching M that 
has a maximum number of edges. For an algorithm, see Sec. 23.8. 


CHAPTER 24 
CHAPTER 25 


PART G 


Probability, 
Statistics 


Data Analysis. Probability Theory 
Mathematical Statistics 


Probability theory (Chap. 24) provides models of probability distributions (theoretical 
models of the observable reality involving chance effects) to be tested by statistical methods, 
and it will also supply the mathematical foundation of these methods in Chap. 25. 


Modern mathematical statistics (Chap. 25) has various engineering applications, for 
instance, in testing materials, control of production processes, quality control of production 
outputs, performance tests of systems, robotics, and automatization in general, production 
planning, marketing analysis, and so on. 


To this we could add a long list of fields of applications, for instance, in agriculture, 
biology, computer science, demography, economics, geography, management of natural 
resources, medicine, meteorology, politics, psychology, sociology, traffic control, urban 
planning, etc. Although these applications are very heterogeneous, we shall see that most 
statistical methods are universal in the sense that each of them can be applied in various 
fields. 


Additional Software for 
Probability and Statistics 


See also the list of software at the beginning of Part E on Numerical Analysis. 
Data Desk. Data Description, Inc., Ithaca, NY. Phone 1-800-573-5121 or (607) 257-1000, 
website at www.datadesk.com. 


1009 


1010 


PART G_ Probability, Statistics 


MINITAB. Minitab, Inc., State College, PA. Phone 1-800-448-3555 or (814) 238-3280, 
website at www.minitab.com. 

SAS. SAS Institute, Inc., Cary, NC. Phone 1-800-727-0025 or (919) 677-8000, website 
at Www.sas.com. 

R. website at www.r-project.org. Free software, part of the GNU/Free Software Foundation 
project. 

SPSS. SPSS, Inc., Chicago, IL. (part of IBM) Phone 1-800-543-2185 or (312) 651-3000, 
website at www.spss.com. 

STATISTICA. StatSoft, Inc., Tulsa, OK. Phone (918) 749-1119, website at 
www.statsoft.com. 

TIBCO Spotfire S+. TIBCO Software Inc., Palo Alto, CA; Office for this software: 
Somerville, MA. Phone 1-866-240-0491 (toll-free), (617) 702-1602, website at spotfire. 
tibco.com/products/s-plus/statistical-analysis-software.aspx 


CHAPTER 2 4 


Data Analysis. 
Probability Theory 


We first show how to handle data numerically or in terms of graphs, and how to extract 
information (average size, spread of data, etc.) from them. If these data are influenced by 
“chance,” by factors whose effect we cannot predict exactly (e.g., weather data, stock 
prices, life spans of tires, etc.), we have to rely on probability theory. This theory 
originated in games of chance, such as flipping coins, rolling dice, or playing cards. 
Nowadays it gives mathematical models of chance processes called random experiments 
or, briefly, experiments. In such an experiment we observe a random variable X, that 
is, a function whose values in a trial (a performance of an experiment) occur “by chance” 
(Sec. 24.3) according to a probability distribution that gives the individual probabilities 
with which possible values of X may occur in the long run. (Example: Each of the six 
faces of a die should occur with the same probability, 1/6.) Or we may simultaneously 
observe more than one random variable, for instance, height and weight of persons or 
hardness and tensile strength of steel. This is discussed in Sec. 24.9, which will also give 
the basis for the mathematical justification of the statistical methods in Chapter 25. 


Prerequisite: Calculus. 
References and Answers to Problems: App. 1 Part G, App. 2. 
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EXAMPLE 1 


Data can be represented numerically or graphically in various ways. For instance, your 
daily newspaper may contain tables of stock prices and money exchange rates, curves or 
bar charts illustrating economical or political developments, or pie charts showing how 
your tax dollar is spent. And there are numerous other representations of data for special 
purposes. 

In this section we discuss the use of standard representations of data in statistics. (For 
these, software packages, such as DATA DESK, R, and MINITAB, are available, and 
Maple or Mathematica may also be helpful; see pp. 789 and 1009) We explain corresponding 
concepts and methods in terms of typical examples. 


Recording and Sorting 


Sample values (observations, measurements) should be recorded in the order in which they occur. Sorting, that 
is, ordering the sample values by size, is done as a first step of investigating properties of the sample and graphing 
it. Sorting is a standard process on the computer; see Ref. [E35], listed in App. 1. 
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Super alloys is a collective name for alloys used in jet engines and rocket motors, requiring high temperature 
(typically 1800°F), high strength, and excellent resistance to oxidation. Thirty specimens of Hastelloy C (nickel- 
based steel, investment cast) had the tensile strength (in 1000 Ib/sq in.), recorded in the order obtained and 
rounded to integer values, 


89 77 88 91 88 93 99 79 87 84 86 82 88 89 78 
(1) 
90 91 81 90 83 83 92 87 89 86 89 81 87 84 89 


Sorting gives 


77 78 79 81 81 82 83 83 84 84 86 86 87 87 87 il 
(2) 
88 88 88 89 89 89 89 89 90 90 91 91 92 93 99 


Graphic Representation of Data 


We shall now discuss standard graphic representations used in statistics for obtaining 
information on properties of data. 


Stem-and-Leaf Plot (Fig. 507) 


This is one of the simplest but most useful representations of data. For (1) it is shown in Fig. 507. The numbers 
in (1) range from 78 to 99; see (2). We divide these numbers into 5 groups, 75-79, 80-84, 85-89, 90-94, 
95-99. The integers in the tens position of the groups are 7, 8, 8, 9, 9. These form the stem in Fig. 507. The 
first leaf is 789, representing 77, 78, 79. The second leaf is 1123344, representing 81, 81, 82, 83, 83, 84, 84. 
And so on. 

The number of times a value occurs is called its absolute frequency. Thus 78 has absolute frequency 1, the 
value 89 has absolute frequency 5, etc. The column to the extreme left in Fig. 507 shows the cumulative absolute 
frequencies, that is, the sum of the absolute frequencies of the values up to the line of the leaf. Thus, the number 
10 in the second line on the left shows that (1) has 10 values up to and including 84. The number 23 in the next 
line shows that there are 23 values not exceeding 89, etc. Dividing the cumulative absolute frequencies by 
n (= 30 in Fig. 507) gives the cumulative relative frequencies 0.1, 0.33, 0.76, 0.93, 1.00. i] 


Histogram (Fig. 508) 


For large sets of data, histograms are better in displaying the distribution of data than stem-and-leaf plots. The 
principle is explained in Fig. 508. (An application to a larger data set is shown in Sec. 25.7). The bases of the 
rectangles in Fig. 508 are the x-intervals (known as class intervals) 74.5—79.5, 79.5—84.5, 84.5-89.5, 89.5—94.5, 
94.5-99.5, whose midpoints (known as class marks) are x = 77, 82, 87, 92, 97, respectively. The height of a 
rectangle with class mark x is the relative class frequency f,.(x), defined as the number of data values in that 
class interval, divided by n (= 30 in our case). Hence the areas of the rectangles are proportional to these 
relative frequencies, 0.10, 0.23, 0.43, 0.17, 0.07, so that histograms give a good impression of the distribution 


of data. | 
hr 
0.5 = 
Leaf unit = 1.0 0.4- 
3 7| 789 0.3- 
10 8 | 1123344 0.2F 
23 8 | 6677788899999 call 
29 9}001123 
30 919 a 77 82 87 92 97° x 
Fig. 507. Stem-and-leaf plot Fig. 508. Histogram of the data in 


of the data in Example 1 Example 1 (grouped as in Fig. 507) 


SEC. 24.1 Data Representation. Average. Spread 1013 


EXAMPLE 4 


Boxplot. Median. Interquartile Range. Outlier 


A boxplot of a set of data illustrates the average size and the spread of the values, in many cases the two most 
important quantities characterizing the set, as follows. 

The average size is measured by the median, or middle quartile, qy. If the number n of values of the set is odd, 
then gy is the middlemost of the values when ordered as in (2). If n is even, then gy is the average of the two 
middlemost values of the ordered set. In (2) we have n = 30 and thus gy 4x45 + X16) 3 (87 + 88) = 87.5. 
(In general, qj will be a fraction if n is even.) 

The spread of values can be measured by the range R = Xmax — Xmin. the largest value minus the smallest 


one. 
Better information on the spread gives the interquartile range IQR = gy — qr. Here gy is the middlemost 
value (or the average of the two middlemost values) in the data above the median; and gz, is the middlemost 
value (or the average of the two middlemost values) in the data below the median. Hence in (2) we have 
qu = X23 = 89, qr = Xg = 83, and IQR = 89 — 83 = 6. 
The box in Fig. 509 extends vertically from gy, to qy; it has height IQR = 6. The vertical lines below and 
above the box extend from x min = 77 to Xmax = 99, so that they show R = 22. 


100 }— 

95;—- 
90 — dy 
Gy 

85 /— 
Gq, 

80;- 

La 


| 
Data set (1) 


Fig. 509. Boxplot of the data set (1) 


The line above the box is suspiciously long. This suggests the concept of an outlier, a value that is more 
than 1.5 times the IQR away from either end of the box; here 1.5 is purely conventional. An outlier indicates 
that something might have gone wrong in the data collection. In (2) we have 89 + 1.5 IQR = 98, and we regard 
99 as an outlier. a 


Mean. Standard Deviation. Variance. 
Empirical Rule 


Medians and quartiles are easily obtained by ordering and counting, practically without 
calculation. But they do not give full information on data: you can change data values to 
some extent without changing the median. Similarly for the quartiles. 

The average size of the data values can be measured in a more refined way by the 
mean 


(3) F=4 Sap =s 01 tay tt xp). 
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This is the arithmetic mean of the data values, obtained by taking their sum and dividing 
by the data size n. Thus in (1), 


X = 39 (89 + 77 +--+ + 89) = 28° = 86.7. 


Every data value contributes, and changing one of them will change the mean. 
Similarly, the spread (variability) of the data values can be measured in a more refined 
way by the standard deviation s or by its square, the variance 


n 


1 E 
—+ j Oe 5 ler = 2 + + Gn — DI. 
Z 


4 fee 


Thus, to obtain the variance of the data, take the difference x; — x of each data value from 
the mean, square it, take the sum of these n squares, and divide it by n — | (not n, as we 
motivate in Sec. 25.2). To get the standard deviation s, take the square root of s7, 

For example, using ¥ = 260/3, we get for the data (1) the variance 


s? = gy [(89 — 289)? + (77 — 280)? + -.- + (89 — 260)2] = 2006 — 93.06 


Hence the standard deviation is s = V2006/87 ~ 4.802. Note that the standard deviation 
has the same dimension as the data values (kg/ mm2?, see at the beginning), which is an 
advantage. On the other hand, the variance is preferable to the standard deviation in 
developing statistical methods, as we shall see in Chap. 25. 


CAUTION! Your CAS (Maple, for instance) may use 1/n instead of 1/(n — 1) in (4), 
but the latter is better when 7 is small (see Sec. 25.2). 


Mean and standard deviation, introduced to give center and spread, actually give much 
more information according to this rule. 


Empirical Rule. For any mound-shaped, nearly symmetric distribution of data the intervals 


x ts, x + 2s 


xX 35 contain about 68%, 95%, 99.7%, 


> 


respectively, of the data points. 


Empirical Rule and Outliers. z-Score 


For (1), with x = 86.7 and s = 4.8, the three intervals in the Rule are 81.9Sx591.5, 77.1 Sx = 96.3, 
72.3 Sx =101.1 and contain 73% (22 values remain, 5 are too small, and 5 too large), 93% (28 values, 
1 too small, and | too large), and 100%, respectively. 
If we reduce the sample by omitting the outlier 99, mean and standard deviation reduce toX;eq = 86.2, Speq = 4.3, 
approximately, and the percentage values become 67% (5 and 5 values outside), 93% (1 and | outside), and 100%. 
Finally, the relative position of a value x in a set of mean x and standard deviation s can be measured by the 
z-score 


This is the distance of x from the mean X measured in multiples of s. For instance, z(83) = (83 — 86.7)/ 
4.8 = —0.77. This is negative because 83 lies below the mean. By the Empirical Rule, the extreme z-values 
are about —3 and 3. a 
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PROBLEM SET 2471 


1-10| DATA REPRESENTATIONS 8. Foundrax test of Brinell hardness (2.5 mm steel ball, 
Represent the data by a stem-and-leaf plot, a histogram, and 62.5 kg load, 30 sec) of 20 copper plates (values in 
a boxplot: kg/mm’) 

1. Length of nails [mm] 86 86 87 89 76 85 82 86 87 85 


90 88 89 90 88 80 84 89 90 89 
19 21 19 20 19 20 21 20 


9. Efficiency [%] of seven Voith Francis turbines of 


2. Phone calls per minute in an office between 9:00 A.M. runner diameter 2.3 m under a head range of 185 m 


and 9:10 AM. 


91.8 89.1 89.9 92.5 90.7 91.2 91.0 
6 6 4 2 17 0 4 6 7 


P pass é 10. —0.51 0.12 —047 0.95 0.25 —-0.18 —0.54 
. Systolic blood pressure of 15 female patients of ages 
20-22 11-16 | AVERAGE AND SPREAD 


156 158 154 133 141 130 144 137 Find the mean and compare it with the median. Find the 
151 146 156 138 138 149 139 standard deviation and compare it with the interquartile range. 


11. For the data in Prob. 1 
4. Iron content [%] of 15 specimens of hermatite (Fe2O3) 12. For the phone call data in Prob. 2 


72.8 70.4 71.2 69.2 70.3 68.9 71.1 69.8 13. For the medical data in Prob. 3 
71.5 69.7 70.5 71.3 69.1 70.9 70.6 14. For the iron contents in Prob. 4 


15. For the release times in Prob. 7 
16. For the Brinell hardness data in Prob. 8 


203. 199 198 201 200 201 201 17. Outlier, reduced data. Calculate s for the data 
4 1 3 10 2. Then reduce the data by deleting 
the outlier and calculate s. Comment. 


5. Weight of filled bags [g] in an automatic filling 


6. Gasoline consumption [miles per gallon, rounded] of 


six cars of the same model under similar conditions i : ; 
18. Outlier, reduction. Do the same tasks as in Prob. 17 


15.0 15.5 145 15.0 15.5 15.0 for the hardness data in Prob. 8. 

19. Construct the simplest possible data with x = 100 but 
du = 0. What is the point of this problem? 

13°12 14 15 13 13 14 11 15 14 20. Mean. Prove that x must always lie between the 

16 13 15 11 14 12 13 15 14 14 smallest and the largest data values. 


7. Release time [sec] of a relay 


24.2 Experiments, Outcomes, Events 


We now turn to probability theory. This theory has the purpose of providing mathematical 
models of situations affected or even governed by “chance effects,” for instance, in weather 
forecasting, life insurance, quality of technical products (computers, batteries, steel sheets, 
etc.), traffic problems, and, of course, games of chance with cards or dice. And the accuracy 
of these models can be tested by suitable observations or experiments—this is a main 
purpose of statistics to be explained in Chap. 25. 

We begin by defining some standard terms. An experiment is a process of measurement 
or observation, in a laboratory, in a factory, on the street, in nature, or wherever; so 
“experiment” is used in a rather general sense. Our interest is in experiments that involve 
randomness, chance effects, so that we cannot predict a result exactly. A trial is a single 
performance of an experiment. Its result is called an outcome or a sample point. 7 trials 
then give a sample of size n consisting of n sample points. The sample space S of an 
experiment is the set of all possible outcomes. 
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Random Experiments. Sample Spaces 


(1) Inspecting a lightbulb. S = {Defective, Nondefective}. 

(2) Rolling a die. S = {1, 2, 3, 4, 5, 6}. 

(3) Measuring tensile strength of wire. § the numbers in some interval. 

(4) Measuring copper content of brass. $: 50% to 90%, say. 

(5) Counting daily traffic accidents in New York. S the integers in some interval. 


(6) Asking for opinion about a new car model. S = {Like, Dislike, Undecided}. | 
The subsets of S are called events and the outcomes simple events. 


Events 


In (2), events are A = {1,3,5} (“Odd number”), B = {2, 4,6} (“Even number”), C = {5,6}. etc. Simple 
events are {1}, {2},---, {6}. el 


If, in a trial, an outcome a happens and a € A (a is an element of A), we say that A 
happens. For instance, if a die turns up a 3, the event A: Odd number happens. Similarly, 
if C in Example 7 happens (meaning 5 or 6 turns up), then, say, D = {4, 5, 6} happens. 
Also note that S happens in each trial, meaning that some event of S always happens. All 
this is quite natural. 


Unions, Intersections, Complements of Events 


In connection with basic probability laws we shall need the following concepts and facts 
about events (subsets) A, B, C,--- of a given sample space S. 

The union A U B of A and B consists of all points in A or B or both. 

The intersection A M B of A and B consists of all points that are in both A and B. 


If A and B have no points in common, we write 
ANB=@ 


where © is the empty set (set with no elements) and we call A and B mutually exclusive 
(or disjoint) because, in a trial, the occurrence of A excludes that of B (and conversely)— 
if your die turns up an odd number, it cannot turn up an even number in the same trial. 
Similarly, a coin cannot turn up Head and Tail at the same time. 

Complement A° of A. This is the set of all the points of S not in A. Thus, 


ANAS = ©, AUAS=S. 


In Example 7 we have AS = B, hence A U AS = {1, 2,3,4,5,6} = S. 

Another notation for the complement of A is A (instead of A‘), but we shall not 
use this because in set theory A is used to denote the closure of A (not needed in 
our work). 

Unions and intersections of more events are defined similarly. The union 


m 
at PLA ae 
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of events Aj,---,A,, consists of all points that are in at least one Aj. Similarly for the 
union A; U Ag U-:: of infinitely many subsets Aj, Ag,--- of an infinite sample space 
S (that is, S consists of infinitely many points). The intersection 


AIH 41 Ag Am 


of Ay,-++,Aj,, consists of the points of S that are in each of these events. Similarly for 
the intersection Ay M Ag ™ --- of infinitely many subsets of S. 

Working with events can be illustrated and facilitated by Venn diagrams’ for showing 
unions, intersections, and complements, as in Figs. 510 and 511, which are typical 
examples that give the idea. 


Unions and Intersections of 3 Events 
In rolling a die, consider the events 
A: Number greater than 3, B: Number less than 6, iC: Even number. 


Then AM B= {4,5}, BN C= {2,4}, CNA = {4,6},A A BO C = {4}. Can you sketch a Venn diagram 


of this? Furthermore, A U B = S, hence A U B U C = S (why?). Bi 
Ss Ss 
UnionA UB Intersection A 4 B 


Fig. 510. Venn diagrams showing two events A and B in a sample space S$ 
and their union A U B (colored) and intersection A M B (colored) 


x 


S 


Fig. 511. Venn diagram for the experiment of rolling a die, showing S, 
A = {1,3, 5}, C = (5, 6}, AUC = (1,3, 5, 6}, AN C = {5} 


PROBLEEM—SET 2-4-2 


1-12} SAMPLE SPACES, EVENTS 3. Rolling 2 dice 


Graph a sample space for the experiments: 4. Rolling a die until the first Six appears 


1. Drawing 3 screws from a lot of right-handed and left- 


handed screws 
2. Tossing 2 coins 


5. Tossing a coin until the first Head appears 


6. Recording the lifetime of each of 3 lightbulbs 


1JOHN VENN (1834-1923), English mathematician. 
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7. 


10. 


11. 
12. 
13. 


14. 


Recording the daily maximum temperature X and the 
daily maximum air pressure Y at Times Square in New 
York 


. Choosing a committee of 2 from a group of 5 people 


. Drawing gaskets from a lot of 10, containing one 


defective D, unitil D is drawn, one at a time and 
assuming sampling without replacement, that is, 
gaskets drawn are not returned to the lot. (More about 
this in Sec. 24.6) 


In rolling 3 dice, are the events A: Sum divisible by 3 
and B: Sum divisible by 5 mutually exclusive? 
Answer the questions in Prob. 10 for rolling 2 dice. 
List all 8 subsets of the sample space S$ = {a, b, c}. 
In Prob. 3 circle and mark the events A: Faces are equal, 
B: Sum of faces less than 5, A UB, AQ B, AS, B®. 

In drawing 2 screws from a lot of right-handed and 
left-handed screws, let A, B, C, D mean at a least 
1 right-handed, at least 1 left-handed, 2 right-handed, 
2 left-handed, respectively. Are A and B mutually 
exclusive? C and D? 


15-20 


VENN DIAGRAMS 


15. 


In connection with a trip to Europe by some students, 
consider the events P that they see Paris, G that they 
have a good time, and M that they run out of money, 
and describe in words the events 1,---,7 in the 
diagram. 


24.3 Probability 


The “probability” of an event A in an experiment is supposed to measure how frequently 
A is about to occur if we make many trials. If we flip a coin, then heads H and tails T 
will appear about equally often—we say that H and T are “equally likely.” Similarly, for 
a regularly shaped die of homogeneous material (“fair die”) each of the six outcomes 
1,---, 6 will be equally likely. These are examples of experiments in which the sample 
space S consists of finitely many outcomes (points) that for reasons of some symmetry 
can be regarded as equally likely. This suggests the following definition. 


DEFINITION 1 
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Problem 15 


16. Show that, by the definition of complement, for any 


subset A of a sample space S. 
(AS =A, S°=O, S=S, 
AUAS=S, ANAS = ©. 


17. Using a Venn diagram, show that A C B if and only if 


AUB=B. 


18. Using a Venn diagram, show that A C B if and only if 


ANB=A., 


19. (De Morgan’s laws) Using Venn diagrams, graph and 


check De Morgan’s laws 
(A U B)® = ASM BS 
(AN B) = AS UBS. 


20. Using Venn diagrams, graph and check the rules 


AU(BNO=(AUB)N(AUO 
AN(BUQ=(ANB)U(ANO. 


First Definition of Probability 


If the sample space S of an experiment consists of finitely many outcomes (points) 
that are equally likely, then the probability P(A) of an event A is 


Number of points in A 


(1) BU 


Number of points in S ” 
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From this definition it follows immediately that, in particular, 


(2) P(S) = 1. 


Fair Die 
In rolling a fair die once, what is the probability P(A) of A of obtaining a 5 or a 6? The probability of B: “Even 


number”? 


Solution. The six outcomes are equally likely, so that each has probability 1/6. Thus P(A) = 2/6 = 1/3 
because A = {5, 6} has 2 points, and P(B) = 3/6 = 1/2. 


Definition | takes care of many games as well as some practical applications, as we shall 
see, but certainly not of all experiments, simply because in many problems we do not 
have finitely many equally likely outcomes. To arrive at a more general definition of 
probability, we regard probability as the counterpart of relative frequency. Recall from 
Sec. 24.1 that the absolute frequency f(A) of an event A in x trials is the number of times 
A occurs, and the relative frequency of A in these trials is f(A)/n; thus 


_ f(A) _ Number of times A occurs 
n Number of trials 


(3) fi rel(A) 


Now if A did not occur, then f(A) = 0. If A always occurred, then f(A) = n. These are 
the extreme cases. Division by n gives 


(4*) 0 S frei(A) = 1. 


In particular, for A = S we have f(S) =n because S always occurs (meaning that 
some event always occurs; if necessary, see Sec. 24.2, after Example 7). Division 
by n gives 


(S*) frei(S) = 1. 


Finally, if A and B are mutually exclusive, they cannot occur together. Hence the absolute 
frequency of their union A U B must equal the sum of the absolute frequencies of A and 
B. Division by n gives the same relation for the relative frequencies, 


(6*) Fre(A U B) = fre(A) + frei(B) (AN B= ©). 


We are now ready to extend the definition of probability to experiments in which equally 
likely outcomes are not available. Of course, the extended definition should include 
Definition 1. Since probabilities are supposed to be the theoretical counterpart of relative 
frequencies, we choose the properties in (4*), (5*), (6*) as axioms. (Historically, such a 
choice is the result of a long process of gaining experience on what might be best and 
most practical.) 
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General Definition of Probability 


Given a sample space S, with each event A of S (subset of S) there is associated a 
number P(A), called the probability of A, such that the following axioms of 
probability are satisfied. 

1. For every A in S, 
(4) V=74)= 1, 

2. The entire sample space S has the probability 
(5) P(S) = 1. 

3. For mutually exclusive events A and B (A M B = ©; see Sec. 24.2), 


(6) P(A U B) = P(A) + P(B) (AN B=). 


If S is infinite (has infinitely many points), Axiom 3 has to be replaced by 
3’. For mutually exclusive events Ay, Ag,-:-, 


(6) P(A, U Ag U-:*) = P(Ay) + P(Ag) + °°. 


In the infinite case the subsets of S on which P(A) is defined are restricted to form a 
so-called o-algebra, as explained in Ref. [GenRef6] (not [G6]!) in App. 1. This is of no 
practical consequence to us. 


Basic Theorems of Probability 


We shall see that the axioms of probability will enable us to build up probability theory 
and its application to statistics. We begin with three basic theorems. The first of them 
is useful if we can get the probability of the complement A® more easily than P(A) 
itself. 


Complementation Rule 


For an event A and its complement A‘ in a sample space S, 


(7) P(A‘) = 1 — P(A). 


By the definition of complement (Sec. 24.2), we have S = A U AS and AN AS = ©. 
Hence by Axioms 2 and 3, 


1 = P(S) = P(A) + P(A®), ~— thus ~—s P(A®) = 1 — P(A). a 


SEC. 24.3 Probability 1021 


EXAMPLE 2 


THEOREM 2 


EXAMPLE 3 


THEOREM 3 


PROOF 


Coin Tossing 
Five coins are tossed simultaneously. Find the probability of the event A: At least one head turns up. Assume 


that the coins are fair. 


Solution. Since each coin can turn up heads or tails, the sample space consists of 27 = 32 outcomes. Since 
the coins are fair, we may assign the same probability (1/32) to each outcome. Then the event A® (No heads 
turn up) consists of only 1 outcome. Hence P(A‘) = 1/32, and the answer is P(A) = 1 — P(AS) = 31/32. B 


The next theorem is a simple extension of Axiom 3, which you can readily prove by 
induction. 


Addition Rule for Mutually Exclusive Events 


For mutually exclusive events Ay,:**, Ay, in a sample space S, 


(8) P(A, U Ag U+::Am) = P(A) + P(Ag) + +++ + P(Am). 


Mutually Exclusive Events 


If the probability that on any workday a garage will get 10-20, 21-30, 31-40, over 40 cars to service is 0.20, 
0.35, 0.25, 0.12, respectively, what is the probability that on a given workday the garage gets at least 21 cars 
to service? 


Solution. Since these are mutually exclusive events, Theorem 2 gives the answer 0.35 + 0.25 + 0.12 = 0.72. 
Check this by the complementation rule. | 


In many cases, events will not be mutually exclusive. Then we have 


Addition Rule for Arbitrary Events 


For events A and B in a sample space, 


(9) P(A U B) = P(A) + P(B) — PAN B). 


C, D, E in Fig. 512 make up A U B and are mutually exclusive (disjoint). Hence by 
Theorem 2, 


P(A U B) = P(C) + PID) + P(E). 


This gives (9) because on the right P(C) + P(D) = P(A) by Axiom 3 and disjointness; 
and P(E) = P(B) — P(D) = P(B) — P(A OM B), also by Axiom 3 and disjointness. wi 


© 


A B 
Fig. 512. Proof of Theorem 3 
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Note that for mutually exclusive events A and B we have A M B = © by definition and, 
by comparing (9) and (6), 


(10) P(O) = 0. 
(Can you also prove this by (5) and (7)?) 


Union of Arbitrary Events 
In tossing a fair die, what is the probability of getting an odd number or a number less than 4? 


Solution. Let A be the event “Odd number” and B the event “Number less than 4.” Then Theorem 3 gives 
the answer 


PAU B)=§+8-8=3 


because A M B = “Odd number less than 4” = {1,3}. za 


Conditional Probability. Independent Events 


Often it is required to find the probability of an event B under the condition that an event 
A occurs. This probability is called the conditional probability of B given A and is denoted 
by P(B|A). In this case A serves as a new (reduced) sample space, and that probability is 
the fraction of P(A) which corresponds to A M B. Thus 


P(A M B) 
(11) P(B|A) = PA) [P(A) # O]. 


Similarly, the conditional probability of A given B is 


P(A B) 
(12) P(A|B) = PB) [P(B) # 0]. 


Solving (11) and (12) for P(A M B), we obtain 


Multiplication Rule 
If A and B are events in a sample space S and P(A) # 0, P(B) # 0, then 


(13) P(A M B) = P(A)P(B|A) = P(B)P(A|B). 


Multiplication Rule 


In producing screws, let A mean “screw too slim” and B “screw too short.” Let P(A) = 0.1 and let the conditional 
probability that a slim screw is also too short be P(B|A) = 0.2. What is the probability that a screw that we pick 
randomly from the lot produced will be both too slim and too short? 


Solution. P(A 2M B) = P(A)P(B|A) = 0.1-0.2 = 0.02 = 2%, by Theorem 4. | 


Independent Events. If events A and B are such that 


(14) P(A {1 B) = P(A)P(B), 
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they are called independent events. Assuming P(A) # 0, P(B) # 0, we see from (11)-(13) 
that in this case 


P(A|B) = P(A), P(B|A) = P(B). 


This means that the probability of A does not depend on the occurrence or nonoccurrence 
of B, and conversely. This justifies the term “independent.” 


Independence of m Events. Similarly, m events A,,---, A, are called independent if 
(15a) P(A, ++: ON Am) = P(Ay):+: P(Am) 
as well as for every k different events Aj;,, Aj,,°-*, Aj. 
(15b) P(Aj, 1 Aj, +++ 1 Aj.) = P(Aj,)P(Aj,) > ** P(Aj,) 
where k = 2,3,---,m— 1. 
Accordingly, three events A, B, C are independent if and only if 


P(A 1 B) = P(A)P(B), 

P(BM C) = P(B)P(C), 

P(C 1 A) = P(C)P(A), 
PANBOC) = P(A)P(B)P(C). 


(16) 


Sampling. Our next example has to do with randomly drawing objects, one at a time, 
from a given set of objects. This is called sampling from a population, and there are 
two ways of sampling, as follows. 


1. In sampling with replacement, the object that was drawn at random is placed back to 
the given set and the set is mixed thoroughly. Then we draw the next object at random. 


2. In sampling without replacement the object that was drawn is put aside. 


Sampling With and Without Replacement 


A box contains 10 screws, three of which are defective. Two screws are drawn at random. Find the probability 
that neither of the two screws is defective. 


Solution. We consider the events 
A: First drawn screw nondefective. 
B: Second drawn screw nondefective. 


Clearly, P(A) = i because 7 of the 10 screws are nondefective and we sample at random, so that each screw 
has the same probability (db) of being picked. If we sample with replacement, the situation before the second 
drawing is the same as at the beginning, and P(B) = ib. The events are independent, and the answer is 


P(A M B) = P(A)P(B) = 0.7- 0.7 = 0.49 = 49%. 


If we sample without replacement, then P(A) = i. as before. If A has occurred, then there are 9 screws left 
in the box, 3 of which are defective. Thus P(B |A) = é = 2 and Theorem 4 yields the answer 


P(A NB) = 75°32 = 47%. 


Is it intuitively clear that this value must be smaller than the preceding one? | 
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PROBLEM SET 2473 


1. 


10. 


11. 


In rolling 3 fair dice, what is the probability of obtaining 
a sum not greater than 16? 


. In rolling 2 fair dice, what is the probability of a sum 


greater than 3 but not exceeding 6? 


. Three screws are drawn at random from a lot of 100 


screws, 10 of which are defective. Find the probability 
of the event that all 3 screws drawn are nondefective, 
assuming that we draw (a) with replacement, (b) without 
replacement. 


. In Prob. 3 find the probability of E: At least 1 defective 


(i) directly, (ii) by using complements; in both cases 
(a) and (b). 


. If a box contains 10 left-handed and 20 right-handed 


screws, what is the probability of obtaining at least 
one right-handed screw in drawing 2 screws with 
replacement? 


. Will the probability in Prob. 5 increase or decrease if we 


draw without replacement. First guess, then calculate. 


. Under what conditions will it make practically no 


difference whether we sample with or without 
replacement? 


. If acertain kind of tire has a life exceeding 40,000 miles 


with probability 0.90, what is the probability that a set 
of these tires on a car will last longer than 40,000 miles? 


. If we inspect photocopy paper by randomly drawing 5 


sheets without replacement from every pack of 500, 
what is the probability of getting 5 clean sheets although 
0.4% of the sheets contain spots? 


Suppose that we draw cards repeatedly and with 
replacement from a file of 100 cards, 50 of which refer 
to male and 50 to female persons. What is the 
probability of obtaining the second “female” card before 
the third “male” card? 

A batch of 200 iron rods consists of 50 oversized rods, 
50 undersized rods, and 100 rods of the desired length. 
If two rods are drawn at random without replacement, 
what is the probability of obtaining (a) two rods of the 


12. 


13. 


14. 


15. 


16. 


17. 
18. 


19. 


desired length, (b) exactly one of the desired length, 
(c) none of the desired length? 


If a circuit contains four automatic switches and we 
want that, with a probability of 99%, during a given 
time interval the switches to be all working, what 
probability of failure per time interval can we admit 
for a single switch? 


A pressure control apparatus contains 3 electronic 
tubes. The apparatus will not work unless all tubes are 
operative. If the probability of failure of each tube 
during some interval of time is 0.04, what is the 
corresponding probability of failure of the apparatus? 


Suppose that in a production of spark plugs the fraction 
of defective plugs has been constant at 2% over a long 
time and that this process is controlled every half hour 
by drawing and inspecting two just produced. Find the 
probabilities of getting (a) no defectives, (b) 1 
defective, (c) 2 defectives. What is the sum of these 
probabilities? 


What gives the greater probability of hitting at least 
once: (a) hitting with probability 1/2 and firing 1 shot, 
(b) hitting with probability 1/4 and firing 2 shots, 
(c) hitting with probability 1/8 and firing 4 shots? First 
guess. 

You may wonder whether in (16) the last relation 
follows from the others, but the answer is no. To see 
this, imagine that a chip is drawn from a box containing 
4 chips numbered 000, 011, 101, 110, and let A, B, C 
be the events that the first, second, and third digit, 
respectively, on the drawn chip is 1. Show that then 
the first three formulas in (16) hold but the last one 
does not hold. 


Show that if B is a subset of A, then P(B) S P(A). 


Extending Theorem 4, show that PAM BMC) = 
P(A)P(BIA)P(C|A 2 B). 


Make up an example similar to Prob. 16, for instance, 
in terms of divisibility of numbers. 


24.4 Permutations and Combinations 


Permutations and combinations help in finding probabilities P(A) = a/k by systematically 
counting the number a of points of which an event A consists; here, k is the number of 
points of the sample space S. The practical difficulty is that a may often be surprisingly 
large, so that actual counting becomes hopeless. For example, if in assembling some 
instrument you need 10 different screws in a certain order and you want to draw them 
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randomly from a box (which contains nothing else) the probability of obtaining them in 
the required order is only 1/3,628,800 because there are 


10! = 1°2°3°4°5:6:7°8-:9- 10 = 3,628,800 
orders in which they can be drawn. Similarly, in many other situations the numbers of 


orders, arrangements, etc. are often incredibly large. (If you are unimpressed, take 20 
screws—how much bigger will the number be?) 


Permutations 
A permutation of given things (elements or objects) is an arrangement of these things in 
a row in some order. For example, for three letters a, b, c there are 3! = 1°2°3 =6 


permutations: abc, acb, bac, bca, cab, cba. This illustrates (a) in the following theorem. 


Permutations 


(a) Different things. The number of permutations of n different things taken 
all at a time is 


(1) il = il 02> 3) sos 7 (read “n factorial”). 
(b) Classes of equal things. If n given things can be divided into c classes of 


alike things differing from class to class, then the number of permutations of 
these things taken all at a time is 


n! 
(2) or eae (ny +ng++++ +ng=Nn) 
Ny-Ng-"""*Ne: 


Where nj is the number of things in the jth class. 


(a) There are n choices for filling the first place in the row. Then n — | things are still 
available for filling the second place, etc. 


(b) 1 alike things in class 1 make nj! permutations collapse into a single permutation 
(those in which class | things occupy the same 7 positions), etc., so that (2) follows 
from (1). i] 


Illustration of Theorem 1(b) 


If a box contains 6 red and 4 blue balls, the probability of drawing first the red and then the blue balls is 
P = 6!4!/10! = 1/210 ~ 0.5%. a 


A permutation of 1 things taken k at a time is a permutation containing only k of the 
n given things. Two such permutations consisting of the same k elements, in a different 
order, are different, by definition. For example, there are 6 different permutations of the 
three letters a, b, c, taken two letters at a time, ab, ac, bc, ba, ca, cb. 

A permutation of things taken k at a time with repetitions is an arrangement obtained 
by putting any given thing in the first position, any given thing, including a repetition of the 
one just used, in the second, and continuing until & positions are filled. For example, there 
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are 3° = 9 different such permutations of a, b, c taken 2 letters at a time, namely, the 
preceding 6 permutations and aa, bb, cc. You may prove (see Team Project 14): 


Permutations 
The number of different permutations of n different things taken k at a time without 


repetitions is 


n! 


(3a) a i ea ae a es 


and with repetitions is 


(3b) n*, 


Illustration of Theorem 2 


In an encrypted message the letters are arranged in groups of five letters, called words. From (3b) we see that 
the number of different such words is 


26° = 11,881,376. 
From (3a) it follows that the number of different such words containing each letter no more than once is 


26!/(26 — 5)! = 26+ 25+ 24+ 23-22 = 7,893,600. fei] 


Combinations 


In a permutation, the order of the selected things is essential. In contrast, a combination 
of given things means any selection of one or more things without regard to order. There 
are two kinds of combinations, as follows. 

The number of combinations of n different things, taken k at a time, without 
repetitions is the number of sets that can be made up from the n given things, each set 
containing k different things and no two sets containing exactly the same k things. 

The number of combinations of n different things, taken k at a time, with repetitions 
is the number of sets that can be made up of k things chosen from the given n things, 
each being used as often as desired. 

For example, there are three combinations of the three letters a, b, c, taken two letters 
at a time, without repetitions, namely, ab, ac, bc, and six such combinations with 
repetitions, namely, ab, ac, bc, aa, bb, cc. 


Combinations 


The number of different combinations of n different things taken, k at a time, without 
repetitions, is 


(4a) (") n! n(n — 1)°--(n-—k 4+ 1) 


kk} kin-—k! 1:2---k ; 


and the number of those combinations with repetitions is 


ve ara 
(4b) k 
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The statement involving (4a) follows from the first part of Theorem 2 by noting that there 
are k! permutations of k things from the given n things that differ by the order of the 
elements (see Theorem 1), but there is only a single combination of those k things of the 
type characterized in the first statement of Theorem 3. The last statement of Theorem 3 
can be proved by induction (see Team Project 14). o 


Illustration of Theorem 3 


The number of samples of five lightbulbs that can be selected from a lot of 500 bulbs is [see (4a)] 


500 500! 500 - 499 - 498 - 497 - 496 
255,244,687,600. |_| 


5 51495! 1+2-3-+4:-5 
Factorial Function 
In (1)-(4) the factorial function is basic. By definition, 
(5) 0! = 1. 
Values may be computed recursively from given values by 
(6) (n+ 1)! = (n+ Int. 


For large n the function is very large (see Table A3 in App. 5). A convenient approximation 
for large n is the Stirling formula” 


n 
(7) nl ~ V2a0 (2) (e = 2.718---) 
where ~ is read “asymptotically equal” and means that the ratio of the two sides of (7) 


approaches | as n approaches infinity. 


Stirling Formula 


n! By (7) Exact Value Relative Error 

4! 23.5 24 2.1% 

10! 3,598,696 3,628,800 0.8% 

20! 2.42279 - 1018 2,432,902,008, 176,640,000 0.4% & 


Binomial Coefficients 


The binomial coefficients are defined by the formula 


(8) 


(<) ala — 1)(a — 2)-+-(a-—k + 1) 
= (k = 0, integer). 


k k! 


2JAMES STIRLING (1692-1770), Scots mathematician. 


1028 


CHAP. 24 Data Analysis. Probability Theory 


The numerator has k factors. Furthermore, we define 


a . : 0) 
(9) (“) = 1, in particular, ( ) = 1, 


For integer a = n we obtain from (8) 


a (=o) 


Binomial coefficients may be computed recursively, because 


re +(e eCity) 


Formula (8) also yields 


D (“")- =pe(™ Et) 
(12) k =(-)) k 


(k = 0, integer). 


(k = 0, integer) 


(m > 0). 

There are numerous further relations; we mention two important ones, 
ae S a) (k=0,n=1, 
S k k+1 both integer) 


and 


(14) = 


ea 


(r = 0, integer). 


PROBLEM SET 24-4 


Note the large numbers in the answers to some of these 
problems, which would make counting cases hopeless! 


1. 


2s 


In how many ways can a company assign 10 drivers to 
n buses, one driver to each bus and conversely? 

List (a) all permutations, (b) all combinations without 
repetitions, (c) all combinations with repetitions, of 5 
letters a, e, i, o, u taken 2 at a time. 


. If abox contains 4 rubber gaskets and 2 plastic gaskets, 


what is the probability of drawing (a) first the plastic 
and then the rubber gaskets, (b) first the rubber and 
then the plastic ones? Do this by using a theorem and 
checking it by multiplying probabilities. 


. An urn contains 2 green, 3 yellow, and 5 red balls. We 


draw 1 ball at random and put it aside. Then we draw 
the next ball, and so on. Find the probability of drawing 


at first the 2 green balls, then the 3 yellow ones, and 
finally the red ones. 


. In how many different ways can we select a committee 


consisting of 3 engineers, 2 physicists, and 2 computer 
scientists from 10 engineers, 5 physicists, and 6 
computer scientists? First guess. 


. How many different samples of 4 objects can we draw 


from a lot of 50? 


. Of a lot of 10 items, 2 are defective. (a) Find the 


number of different samples of 4. Find the number of 
samples of 4 containing (b) no defectives, (c) 1 
defective, (d) 2 defectives. 


. Determine the number of different bridge hands. (A 


bridge hand consists of 13 cards selected from a full 
deck of 52 cards.) 
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9. 


10. 


11. 


12. 


13. 


In how many different ways can 6 people be seated at 
a round table? 


If a cage contains 100 mice, 3 of which are male, what 
is the probability that the 3 male mice will be included 
if 10 mice are randomly selected? 


How many automobile registrations may the police 
have to check in a hit-and-run accident if a witness 
reports KDP7 and cannot remember the last two digits 
on the license plate but is certain that all three digits 
were different? 


If 3 suspects who committed a burglary and 6 innocent 
persons are lined up, what is the probability that a 
witness who is not sure and has to pick three persons 
will pick the three suspects by chance? That the witness 
picks 3 innocent persons by chance? 


CAS PROJECT. Stirling formula. (a) Using (7), 
compute approximate values of n! for n = 1,---, 20. 

(b) Determine the relative error in (a). Find an 
empirical formula for that relative error. 

(c) An upper bound for that relative error is 
e¥/12" _ 1 Try to relate your empirical formula to this. 
(d) Search through the literature for further information 
on Stirling’s formula. Write a short eassy about your 
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15. 
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findings, arranged in logical order and illustrated with 
numeric examples. 

TEAM PROJECT. Permutations, Combinations. 
(a) Prove Theorem 2. 

(b) Prove the last statement of Theorem 3. 

(c) Derive (11) from (8). 

(d) By the binomial theorem, 


@tbp=> (") abr, 


k=0 


so that a®b”—* has the coefficient (7). Can you 


conclude this from Theorem 3 or is this a mere 
coincidence? 

(e) Prove (14) by using the binomial theorem. 

(f) Collect further formulas for binomial coefficients 
from the literature and illustrate them numerically. 


Birthday problem. What is the probability that in a 
group of 20 people (that includes no twins) at least 
two have the same birthday, if we assume that the 
probability of having birthday on a given day is 1/365 
for every day. First guess. Hint. Consider the com- 
plementary event. 


In Sec. 24.1 we considered frequency distributions of data. These distributions show the 
absolute or relative frequency of the data values. Similarly, a probability distribution 
or, briefly, a distribution, shows the probabilities of events in an experiment. The quantity 
that we observe in an experiment will be denoted by X and called a random variable 
(or stochastic variable) because the value it will assume in the next trial depends on 
chance, on randomness—if you roll a die, you get one of the numbers from | to 6, but 
you don’t know which one will show up next. Thus X = Number a die turns up is a 
random variable. So is X = Elasticity of rubber (elongation at break). (“Stochastic” means 
related to chance.) 

If we count (cars on a road, defective screws in a production, tosses until a die shows 
the first Six), we have a discrete random variable and distribution. If we measure 
(electric voltage, rainfall, hardness of steel), we have a continuous random variable and 
distribution. Precise definitions follow. In both cases the distribution of X is determined 
by the distribution function 


(1) PRX =X), 
this is the probability that in a trial, X will assume any value not exceeding x. 


CAUTION! The terminology is not uniform. F(x) is sometimes also called the 
cumulative distribution function. 
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For (1) to make sense in both the discrete and the continuous case we formulate con- 
ditions as follows. 


Random Variable 


A random variable X is a function defined on the sample space S of an experiment. 
Its values are real numbers. For every number a the probability 


P(X = a) 
with which X assumes a is defined. Similarly, for any interval J the probability 
P(X ET) 


with which X assumes any value in J is defined. 


Although this definition is very general, in practice only a very small number of distributions 
will occur over and over again in applications. 

From (1) we obtain the fundamental formula for the probability corresponding to an 
intervala <x Sb, 


(2) P(a<X Sb) = Fb) — F(a). 


This follows because X = a (“X assumes any value not exceeding a”) and a < X =b 
(“X assumes any value in the interval a < x S b”) are mutually exclusive events, so that 
by (1) and Axiom 3 of Definition 2 in Sec. 24.3 


F(b) = P(X Sb) = P(X Sa) t+ Pa<XSb) 
= F(a) + Pa<X Sb) 


and subtraction of F(a) on both sides gives (2). 


Discrete Random Variables and Distributions 


By definition, a random variable X and its distribution are discrete if X assumes only finitely 
many or at most countably many values x1, x2, x3,-*-, called the possible values of X, 
with positive probabilities py, = P(X = x1), po = P(X = x2), p3 = P(X = %3),°°°, 
whereas the probability P(X € /) is zero for any interval J containing no possible value. 

Clearly, the discrete distribution of X is also determined by the probability function 
F(x) of X, defined by 


joy ke = aes 
(3) fe =4 G = 1,2,-+*), 


QO otherwise 


From this we get the values of the distribution function F(x) by taking sums, 


(4) Fa = > fe) = Dd pj 


4 = 
uj=x =x 
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where for any given x we sum all the probabilities p; for which x; is smaller than or equal 
to that of x. This is a step function with upward jumps of size p; at the possible values 
x; of X and constant in between. 


EXAMPLE 1_ Probability Function and Distribution Function 


Figure 513 shows the probability function f(x) and the distribution function F(x) of the discrete random variable 
X = Number a fair die turns up. 


X has the possible values x = 1, 2,3, 4,5, 6 with probability 1/6 each. At these x the distribution function 
has upward jumps of magnitude 1/6. Hence from the graph of f(x) we can construct the graph of F(x) and 


conversely. 
In Figure 513 (and the next one) at each jump the fat dot indicates the function value at the jump! B 
f(x) f(x) 
Tele 
“TLL sgeerr ae 
al LL plied ca ttl Eros 


6) 5 x 
Fig. 513. Probability function f(x) Fig. 514. Probability function f(x) and 
and distribution function F(x) of the distribution function F(x) of the random 
random variable X = Number variable X = Sum of the two numbers 
obtained in tossing a fair die once obtained in tossing two fair dice once 


EXAMPLE 2 _ Probability Function and Distribution Function 


The random variable X = Sum of the two numbers two fair dice turn up is discrete and has the possible values 
2(=1+4 1,3,4,:--,12(=6 + 6). There are 6-6 = 36 equally likely outcomes (1, 1) (1, 2),---, (6, 6), 
where the first number is that shown on the first die and the second number that on the other die. Each such 
outcome has probability 1/36. Now X = 2 occurs in the case of the outcome (1, 1); X = 3 in the case of the 
two outcomes (1, 2) and (2, 1); X = 4 in the case of the three outcomes (1, 3), (2, 2), (3, 1); and so on. Hence 
f(x) = P(X = x) and F(x) = P(X S x) have the values 


x 2 3 4 5 6 W 8 9) 10 11 12 


f(x) 16 2/36 3/36 4/36 5/36) 66/36 = 55/36 4/36 33/36 2/36 1/36 
Fx) 1/36 3/36 6/36 10/36 15/36 21/36 26/36 30/36 33/36 35/36 36/36 


Figure 514 shows a bar chart of this function and the graph of the distribution function, which is again a step 
function, with jumps (of different height!) at the possible values of X. a 
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Two useful formulas for discrete distributions are readily obtained as follows. For the 
probability corresponding to intervals we have from (2) and (4) 


Pa<X=b)=Fb)-Fa= > 2; (X discrete). 


a<x;Sb 


(5) 


This is the sum of all probabilities p; for which x; satisfies a < x; = b. (Be careful about 
< and S !) From this and P(S) = 1 (Sec. 24.3) we obtain the following formula. 


6) p= 1 (sum of all probabilities). 
ij 
Illustration of Formula (5) 


In Example 2, compute the probability of a sum of at least 4 and at most 8. 


Solution. PG <X <8) = F8) — F3) = 28 - 3 =2. a 


Waiting Time Problem. Countably Infinite Sample Space 


In tossing a fair coin, let X = Number of trials until the first head appears. Then, by independence of events 
(Sec. 24.3), 


PX =1)=P(H) =3 (H = Head) 
P(X =2)=P(TH) =4-:4 =}3 (T = Tail) 
P(X = 3) = P(TTH) = 5°35 °3=% etc. 
and in general P(X = n) = ¢ y",n = 1, 2,---. Also, (6) can be confirmed by the sum formula for the geometric 
series, 
1 1 1 
bob ope 1 : 
2 4 8 1-4 
=-14+2-1 | 


Continuous Random Variables and Distributions 


Discrete random variables appear in experiments in which we count (defectives in a 
production, days of sunshine in Chicago, customers standing in a line, etc.). Continuous 
random variables appear in experiments in which we measure (lengths of screws, voltage 
in a power line, Brinell hardness of steel, etc.). By definition, a random variable X and 
its distribution are of continuous type or, briefly, continuous, if its distribution function 
F(x) [defined in (1)] can be given by an integral 


x 


(7) F(x) = | f(v) dv 
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(we write v because x is needed as the upper limit of the integral) whose integrand f(x), 
called the density of the distribution, is nonnegative, and is continuous, perhaps except 
for finitely many x-values. Differentiation gives the relation of f to F' as 


(8) 7G) = F'G@) 


for every x at which f(x) is continuous. 
From (2) and (7) we obtain the very important formula for the probability corresponding 
to an interval: 


b 
(9) Pa < X Sb) = F(b) — F(a) = | fv) dv. 


a 


This is the analog of (5). 
From (7) and P(S) = 1 (Sec. 24.3) we also have the analog of (6): 


oo 


(10) | fold = 1. 


—0 


Continuous random variables are simpler than discrete ones with respect to intervals. 
Indeed, in the continuous case the four probabilities corresponding to a< X Sb, 
a<X<baSX <b, anda =X 3S bd with any fixed a and b (> a) are all the same. 
Can you see why? (Answer. This probability is the area under the density curve, as in 
Fig. 515, and does not change by adding or subtracting a single point in the interval of 
integration.) This is different from the discrete case! (Explain.) 

The next example illustrates notations and typical applications of our present formulas. 


Curve of density 


f(x) 
Pla<X<b) 


a b x 


Fig. 515. Example illustrating formula (9) 


Continuous Distribution 


Let X have the density function f(x) = 0.75(1 — x?) if —1 =x S 1 and zero otherwise. Find the distribution 
function. Find the probabilities P(-3 =x 3) and P(A SX S 2). Find x such that P(X S x) = 0.95. 


Solution. From (7) we obtain F(x) = 0 if x S —1, 


x 
F(x) = 075 | (1 — v) dv = 0.5 + 0.75x — 0.25x3 if-l<x<1, 
-1 


and F(x) = lifx > 1. From this and (9) we get 


1/2 
P(-§ =X Sb) = FQ) — F(-3) = 0.75 | (1 — v?) dv = 68.75% 
-1/2 


1034 


CHAP. 24 Data Analysis. Probability Theory 


(because P(-4 =xs 3) = P(-4 <XS 4) for a continuous distribution) and 


1 
F(4) = 0.75 | (1 — v2) du = 31.64%. 


1/4 


(Note that the upper limit of integration is 1, not 2. Why?) Finally, 


P(X =x) 


F(x) 


0.5 + 0.75x — 0.25x? = 0.95. 


Algebraic simplification gives 3x — x? = 1.8. A solution is x = 0.73, approximately. 


Sketch f(x) and mark x = -3, 3, : 


the curve. Sketch also F(x). 


a, and 0.73, so that you can see the results (the probabilities) as areas under 


Further examples of continuous distributions are included in the next problem set and in 


later sections. 


PROBLEM SET 2475 


1. 


Graph the probability function f(x) = kx? (x = 1,2, 3, 
4,5: k suitable) and the distribution function. 


. Graph the density function f(x) = kx?(0S x5; 


k suitable) and the distribution function. 


. Uniform distribution. Graph f and F when the density 


of X is f(x) = k = constif —2 Sx S2 and 0 else- 
where. Find P(O S X S 2). 


. In Prob. 3 find c and ¢ such that P(-c < X <c) = 


95% and P(O< X < ¢) = 95%. 


. Graph f and F when f(—2) = (2) = 3, f(-D = 


fM= 3. Can f have further positive values? 


. A box contains 4 right-handed and 6 left-handed 


screws. Two screws are drawn at random without 
replacement. Let X be the number of left-handed 
screws drawn. Find the probabilities P(X = 0), 
P(X =1), P(X =2), PAU<xX<2), PX 21), 
P(X 2 1), P(X > 1), and P(0.5 < X < 10). 


. Let X be the number of years before a certain kind of 


pump needs replacement. Let X have the probability 
function f(x) = kx?, x = 0, 1, 2, 3, 4, Find k. Sketch f 
and F. 


. Graph the distribution function F(x) = 1 — e° if 


x > 0, F(x) = Oif x S 0, and the density f(x). Find x 
such that F(x) = 0.9. 


. Let X [millimeters] be the thickness of washers. 


Assume that X has the density f(x)=k if 
0.9 <x < 1.1 and 0 otherwise. Find k. What is the 
probability that a washer will have thickness between 
0.95 mm and 1.05 mm? 


10. 


11. 


12 


13. 


14. 


15. 


If the diameter X of axles has the density f(x) = k if 
119.9 =x =120.1 and O otherwise, how many 
defectives will a lot of 500 axles approximately contain 
if defectives are axles slimmer than 119.91 or thicker 
than 120.09? 


Find the probability that none of three bulbs in a traffic 
signal will have to be replaced during the first 1500 
hours of operation if the lifetime X of a bulb is a random 
variable with the density f(x) = 6[0.25 — @ — 1.5)?] 
when | = x S 2 andf(x) = 0 otherwise, where x is 
measured in multiples of 1000 hours. 


Let X be the ratio of sales to profits of some company. 
Assume that X has the distribution function F(x) = 0 if 
x <2, Fx) = (x? — 4)/5 if 25x%<3, Fo) = lif 
x 2 3. Find and sketch the density. What is the probability 
that X is between 2.5 (40% profit) and 5 (20% profit)? 


Suppose that in an automatic process of filling oil 
cans, the content of a can (in gallons) is Y = 100 + X, 
where X is a random variable with density 
f(x) = 1 — |x| when |x| S1 and 0 when |x| > 1. 
Sketch f(x) and F(x). In a lot of 1000 cans, about how 
many will contain 100 gallons or more? What is the 
probability that a can will contain less than 99.5 
gallons? Less than 99 gallons? 


Find the probability function of X = Number of times 
a fair die is rolled until the first Six appears and show 
that it satisfies (6). 


Let X be a random variable that can assume every real 
value. What are the complements of the events X S b, 
X<bX20X>ce,b8EX8c¢c,b<XZc? 
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24.6 Mean and Variance of a Distribution 


EXAMPLE-1 


EXAMPLE 2 


The mean yp and variance o” of arandom variable X and of its distribution are the theoretical 
counterparts of the mean x and variance s” of a frequency distribution in Sec. 24.1 and 
serve a similar purpose. Indeed, the mean characterizes the central location and the variance 
the spread (the variability) of the distribution. The mean yw (mu) is defined by 


(a) w= DS xj f (Xj) (Discrete distribution) 
i 
(1) 


C) 


(b) w= | xf (x) dx (Continuous distribution) 


-7 


and the variance o” (sigma square) by 


(a) ie = >y CaS b)*f(x;) (Discrete distribution) 
(2) : 


-) 


(b) o | Gr= w)"f(x) dx (Continuous distribution). 


—7 


o (the positive square root of o”) is called the standard deviation of X and its distribution. 
fis the probability function or the density, respectively, in (a) and (b). 

The mean wp is also denoted by E(X) and is called the expectation of X because it gives 
the average value of X to be expected in many trials. Quantities such as mw and o” that 
measure certain properties of a distribution are called parameters. j and o” are the two 
most important ones. From (2) we see that 


(3) g” >0 


(except for a discrete “distribution” with only one possible value, so that o” = 0). We 
assume that and o” exist (are finite), as is the case for practically all distributions that 
are useful in applications. 


Mean and Variance 


The random variable X = Number of heads in a single toss of a fair coin has the possible values X = 0 and 
X= 1 with probabilities P(X = 0) = 3 and P(X = 1)= 3. From (la) we thus obtain the mean 
w=0 -3 ae +3 = 3, and (2a) yields the variance 


oF = (0-53 + (1-3) 3 = 4. a 


Uniform Distribution. Variance Measures Spread 


The distribution with the density 


1 
fa) = —— if a<x<b 
b-a 
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and f = 0 otherwise is called the uniform distribution on the interval a < x < b. From (1b) (or from Theorem 1, 
below) we find that ~ = (a + b)/2, and (2b) yields the variance 


b 2 2 
+b 1 b- 
=| (s Z ) dx ¢ ay 
i 2 b-a 12 


Figure 516 illustrates that the spread is large if and only if o” is large. 
( 
" (o?-+) ie (-?-3) 
1 o = 7 lr Cr ST 


F(x) F(x) 


| | 
0 1 x -I 0) 1 2 x 


Fig. 516. Uniform distributions having the same mean (0.5) but different variances a” 


Symmetry. We can obtain the mean yw without calculation if a distribution is symmetric. 
Indeed, you may prove 


THEOREM 1 Mean of a Symmetric Distribution 


If a distribution is symmetric with respect to x = c, that is, f(c — x) = f(c + x), 
then ww = c. (Examples | and 2 illustrate this.) 


Transformation of Mean and Variance 


Given a random variable X with mean w and variance o”, we want to calculate the mean 


and variance of X* = a, + dX, where a, and dg are given constants. This problem is 
important in statistics, where it often appears. 


THEOREM 2 Transformation of Mean and Variance 
(a) If a random variable X has mean yp and variance a, then the random 
variable 
(4) X* = ay + aoX (az > 0) 


has the mean y* and variance o**, where 


(5) Bb* =a, + ao and ao*" = ago". 
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PROOF 


(b) In particular, the standardized random variable Z corresponding to X, 
given by 


(6) La 


has the mean 0 and the variance 1. 


We prove (5) for a continuous distribution. To a small interval J of length Ax on the 
x-axis there corresponds the probability f(x)Ax [approximately; the area of a rectangle 
of base Ax and height f(x)]. Then the probability f(~)Ax must equal that for the 
corresponding interval on the x*-axis, that is, f*(x*)Ax*, where f* is the density of X* 
and Ax* is the length of the interval on the x*-axis corresponding to J. Hence for 
differentials we have f*(x*) dx* = f(x) dx. Also, x* = ay + dox by (A), so that (1b) 
applied to X* gives 


a* = | x*F F(x") dx* 


—1 


| (a, + dgx) f(x) dx 


—01 
oo 


as| SQ) dx + a.| xf (x) dx. 


—2 —1 


On the right the first integral equals 1, by (10) in Sec. 24.5. The second intergral is pw. 
This proves (5) for w*. It implies 


x* — p* = (ay + dgx) — (ay + dap) = dag(x — p). 


From this and (2) applied to X*, again using f*(x*) dx* = f(x) dx, we obtain the second 
formula in (5), 


oe? = | (xt — pt) fGr) de* = af | (x — w)?f(x) dx = ado. 


—0 —-7 


For a discrete distribution the proof of (5) is similar. 
Choosing a, = —p/o and ag = 1/o we obtain (6) from (4), writing X* = Z. For these 
a1, dg formula (5) gives w* = 0 and c= 1, as claimed in (b). i] 


Expectation, Moments 


Recall that (1) defines the expectation (the mean) of X, the value of X to be expected on 
the average, written w = E(X). More generally, if g(x) is nonconstant and continuous for 
all x, then g(X) is a random variable. Hence its mathematical expectation or, briefly, its 
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expectation E(g(X)) is the value of g(X) to be expected on the average, defined [similarly 
to (1)] by 


(7) E(g(X)) = Si eiepf@;) or E(g(X) = | B(x) f(x) dx. 
j 


—2 


In the first formula, f is the probability function of the discrete random variable X. In the 
second formula, fis the density of the continuous random variable X. Important special 
cases are the kth moment of X (where k = 1, 2,---) 


(8) E(X*) = Si xffj) or | x(x) dx 


7] —21 


and the kth central moment of X (k = 1, 2,---) 


(9) EX - pl") = DS Gy - w)*f@) or | (x — p) f(a) dx. 


ni) —2 


This includes the first moment, the mean of X 

(10) b= E(X) [(8) with k = 1]. 
It also includes the second central moment, the variance of X 

(11) o” = E([X — p]’) [(9) with k = 2]. 


For later use you may prove 


(12) E(1) = 1. 


PROBLEM SET 24-6 


1-8 


Find the mean and variance of the random variable X with 
probability function or density f(x). 


MEAN, VARIANCE 10. If, in Prob. 9, a defective bolt is one that deviates from 


1.00 cm by more than 0.06 cm, what percentage of 
defectives should we expect? 


1. f(x) = kx (0 S x S 2,k suitable) 11. For what choice of the maximum possible deviation 
2. X = Number a fair die turns up from 1.00 cm shall we obtain 10% defectives in Probs. 9 
3. Uniform distribution on [0, 277] and 10? 

4. ¥Y = V3(X — w)/7 with X as in Prob. 3 12. What total sum can you expect in rolling a fair die 
5. f(x) = de~* (x = 0) 20 times? Do the experiment. Repeat it a number of 


6. f(x) = kd — x?) if —1 =x = 1 and 0 otherwise 


times and record how the sum varies. 


7. f(x = Ce (x = 0) 13. What is the expected daily profit if a store sells X air 


8. X = Number of times a fair coin is flipped until the 
first Head appears. (Calculate yw only.) 


9. If the diameter X [cm] of certain bolts has the density 


conditioners per day with probability f(10) = 0.1, 
fC) = 0.3, fU12) = 0.4, fU13) = 0.2 and the profit 
per conditioner is $55? 


f(x) = kx — 0.9)(1.1 — x) for 0.9<x< 1.1 and 0 14. Find the expectation of g(X) = X%, where X is uniformly 
for other x, what are k, w, and o”? Sketch (x). distributed on the interval -1 Sx 31. 
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15. 


16. 


17. 


18. 


19. 


20. 


A small filling station is supplied with gasoline every 
Saturday afternoon. Assume that its volume X of sales 
in ten thousands of gallons has the probability density 
ft) = 6x1 — x) if OSxS1 and O otherwise. 
Determine the mean, the variance, and the standardized 
variable. 

What capacity must the tank in Prob. 15 have in order 
that the probability that the tank will be emptied in a 
given week be 5%? 

James rolls 2 fair dice, and Harry pays k cents to James, 
where k is the product of the two faces that show on 
the dice. How much should James pay to Harry for 
each game to make the game fair? 

What is the mean life of a lightbulb whose life X [hours] 
has the density f(x) = 0.001e7 °°” (x = 0)? 

Let X be discrete with probability function f(0) = f(3) = 
3. FQ) =fQ) = 3. Find the expectation of X?. 
TEAM PROJECT. Means, Variances, Expectations. 
(a) Show that E(X — w) = 0, 0? = E(X?) — w?. 


(b) Prove (10)-(12). 

(c) Find all the moments of the uniform distribution 
on an intervala Sx Sb. 

(d) The skewness y of a random variable X is defined 
by 


1 
(13) y= —5 UK — py). 
oO 


Show that for a symmetric distribution (whose third 
central moment exists) the skewness is zero. 

(e) Find the skewness of the distribution with density 
f(x) = xe~* when x >0O and f(x) = 0 otherwise. 
Sketch f(x). 

(f) Calculate the skewness of a few simple discrete 
distributions of your own choice. 


(g) Find a nonsymmetric discrete distribution with 
3 possible values, mean 0, and skewness 0. 


24./ Binomial, Poisson, and Hypergeometric 


Distributions 


These are the three most important discrete distributions, with numerous applications. 


Binomial Distribution 


The binomial distribution occurs in games of chance (rolling a die, see below, etc.), 
quality inspection (e.g., counting of the number of defectives), opinion polls (counting 
number of employees favoring certain schedule changes, etc.), medicine (e.g., recording 
the number of patients who recovered on a new medication), and so on. The conditions 


of its occurrence are as follows. 


We are interested in the number of times an event A occurs in 1 independent trials. In 
each trial the event A has the same probability P(A) = p. Then in a trial, A will not occur 
with probability g = 1 — p. Inn trials the random variable that interests us is 


X = Number of times the event A occurs in n trials. 


X can assume the values 0, 1,---,, and we want to determine the corresponding 
probabilities. Now X = x means that A occurs in x trials and in n — x trials it does not 


occur. This may look as follows. 


(1) A AA 


e—__—__Y 


B B::-B. 


oS 
n — x times 


Here B = A‘ is the complement of A, meaning that A does not occur (Sec. 24.2). We now 
use the assumption that the trials are independent, that is, they do not influence each other. 
Hence (1) has the probability (see Sec. 24.3 on independent events) 
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(1*) pep * gp g=rg™. 
x times n — x times 


Now (1) is just one order of arranging x A’s and n — x B’s. We now use Theorem 1(b) 
in Sec. 24.4, which gives the number of permutations of n things (the n outcomes of the 
n trials) consisting of 2 classes, class | containing the n, = x A’s and class 2 containing 
the n — ny =n — x B’s. This number is 


n! = (") 
x!l(n — x)! AGS? 


Accordingly, (1*), multiplied by this binomial coefficient, gives the probability P(X = x) 
of X = x, that is, of obtaining A precisely x times in n trials. Hence X has the probability 
function 


(2) fx) = (") Gia (x = 0,1,-+-, 2) 


and f(x) = 0 otherwise. The distribution of X with probability function (2) is called the 
binomial distribution or Bernoulli distribution. The occurrence of A is called success 
(regardless of what it actually is; it may mean that you miss your plane or lose your watch) 
and the nonoccurrence of A is called failure. Figure 517 shows typical examples. Numeric 
values can be obtained from Table A5 in App. 5 or from your CAS. 

The mean of the binomial distribution is (see Team Project 16) 
(3) = np 
and the variance is (see Team Project 16) 


(4) o” = npg. 


For the symmetric case of equal chance of success and failure (p = g = 3) this gives the 
mean n/2, the variance n/4, and the probability function 


n iL We 
(2*) f@ = (") (5) (x = 0, 1,-:-,n). 


0.5 


i, . : i a 
9% 5 O 5 O 5 O 


p=0.1 p=02 p=0.5 p=0.8 p=0.9 


Fig. 517. Probability function (2) of the binomial distribution for n = 5 and various values of p 
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EXAMPLE=-1 


EXAMPLE 2 


Binomial Distribution 
Compute the probability of obtaining at least two “Six” in rolling a fair die 4 times. 


Solution. p = P(A) = P(‘“Six’) = a q= 3, n = 4. The event “At least two ‘Six’” occurs if we obtain 2 or 
3 or 4 “Six.” Hence the answer is 


para +s0+20=(3) (5) (6) +)G) G+) 


1 171 
(6:25+4-54+1) 13.2%. a 
64 1296 


4 


Poisson Distribution 


The discrete distribution with infinitely many possible values and probability function 


x 


(5) fle) =e (x = 0,1,--:) 


is called the Poisson distribution, named after S. D. Poisson (Sec. 18.5). Figure 518 
shows (5) for some values of jx. It can be proved that this distribution is obtained as a 
limiting case of the binomial distribution, if we let p 0 and n—~ so that the mean 
(4 = np approaches a finite value. (For instance, 4 = np may be kept constant.) The 
Poisson distribution has the mean yu and the variance (see Team Project 16) 


(6) a = pb. 
Figure 518 gives the impression that, with increasing mean, the spread of the distribution 


increases, thereby illustrating formula (6), and that the distribution becomes more and 
more (approximately) symmetric. 


0.5 


1 i" | i" Piles 
5 0 5 0 5 ) 5 10 


u=0.5 p=l b=2 p=5 
Fig. 518. Probability function (5) of the Poisson distribution for various values of uw 


Poisson Distribution 


If the probability of producing a defective screw is p = 0.01, what is the probability that a lot of 100 screws 
will contain more than 2 defectives? 


Solution. The complementary event is A®: Not more than 2 defectives. For its probability we get, from the 
binomial distribution with mean = np = 1, the value [see (2)] 


P(A‘) = co 0.99100 + ey 0.01 + 0.99% + (*"") 0.017 + 0.99%. 
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Since p is very small, we can approximate this by the much more convenient Poisson distribution with mean 
pb = np = 100- 0.01 = 1, obtaining [see (5)] 
P(A) ~e (1 +143) 
= 91.97%. 


Thus P(A) = 8.03%. Show that the binomial distribution gives P(A) = 7.94%, so that the Poisson approximation 
is quite good. 


Parking Problems. Poisson Distribution 


If on the average, 2 cars enter a certain parking lot per minute, what is the probability that during any given 
minute 4 or more cars will enter the lot? 


Solution. To understand that the Poisson distribution is a model of the situation, we imagine the minute to 
be divided into very many short time intervals, let p be the (constant) probability that a car will enter the lot 
during any such short interval, and assume independence of the events that happen during those intervals. Then 
we are dealing with a binomial distribution with very large n and very small p, which we can approximate by 
the Poisson distribution with 


= np = 2, 


because 2 cars enter on the average. The complementary event of the event “4 cars or more during a given 
minute” is “3 cars or fewer enter the lot” and has the probability 


0 al A 3 
FO) + fC) + f2) + FG) (2 hee ) 
0! 1! 2! 3! 


= 0.857. 


Answer: 14.3%. (Why did we consider that complement?) is] 


Sampling with Replacement 


This means that we draw things from a given set one by one, and after each trial we 
replace the thing drawn (put it back to the given set and mix) before we draw the next 
thing. This guarantees independence of trials and leads to the binomial distribution. 
Indeed, if a box contains N things, for example, screws, M of which are defective, the 
probability of drawing a defective screw in a trial is p = M/N. Hence the probability of 
drawing a nondefective screw is g = 1 — p = 1 — M/N, and (2) gives the probability of 
drawing x defectives in n trials in the form 


(7) fa = (") (4) (1 = 1) ‘. (x =0,1,-:-,n). 


Sampling without Replacement. 
Hypergeometric Distribution 


Sampling without replacement means that we return no screw to the box. Then we no 
longer have independence of trials (why?), and instead of (7) the probability of drawing 
x defectives in n trials is 
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EXAMPLE 4 


The distribution with this probability function is called the hypergeometric distribution 
(because its moment generating function (see Team Project 16) can be expressed by the 
hypergeometric function defined in Sec. 5.4, a fact that we shall not use). 


Derivation of (8). By (4a) in Sec. 24.4 there are 


N 
(a) ( ) different ways of picking n things from N, 
n 
M\ .. cits 
(b) ( ) different ways of picking x defectives from M, 
x 


N-M 
(c) ( ) different ways of picking n — x nondefectives from N — M, 
N= X 


and each way in (b) combined with each way in (c) gives the total number of mutually 
exclusive ways of obtaining x defectives in n drawings without replacement. Since (a) is 
the total number of outcomes and we draw at random, each such way has the probability 


N 
i/( From this, (8) follows. ia] 
n 


The hypergeometric distribution has the mean (Team Project 16) 
(9) =n 


and the variance 


(10) gre nM(N — M)(N — n) 
N7(N — 1) 


Sampling with and without Replacement 


We want to draw random samples of two gaskets from a box containing 10 gaskets, three of which are defective. 
Find the probability function of the random variable X = Number of defectives in the sample. 


Solution. We have N = 10,M = 3,N — M =7,n = 2. For sampling with replacement, (7) yields 


ae (") ( ; ) ( : y f(0) = 0.49, f(1) = 0.42, f(2) = 0.09 
x) = =0. LG = G¢ies 
, x/ \10 10 * . $ . 


For sampling without replacement we have to use (8), finding 


-(*)/ 7 ye) fh) = 1) =e OT. A ew aD 5 
BOA higa aya SON I gets ON, Ne 
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If N, M, and N — M are large compared with n, then it does not matter too much whether 
we sample with or without replacement, and in this case the hypergeometric distribution 
may be approximated by the binomial distribution (with p = M/N), which is somewhat 


simpler. 


Hence, in sampling from an indefinitely large population (“infinite population’), we 
may use the binomial distribution, regardless of whether we sample with or without 


replacement. 


PROBLEM SET 24-7 


1. 


Mark the positions of w in Fig. 517. Comment. 


2. Graph (2) for n = 8 as in Fig. 517 and compare with 


10. 


. Rutherford—Geiger experiments. In 


Fig. 517. 


. In Example 3, if 5 cars enter the lot on the average, 


what is the probability that during any given minute 6 
or more cars will enter? First guess. Compare with 
Example 3. 


. How do the probabilities in Example 4 of the text 


change if you double the numbers: drawing 4 gaskets 
from 20, 6 of which are defective? First guess. 


. Five fair coins are tossed simultaneously. Find the 


probability function of the random variable X = Number 
of heads and compute the probabilities of obtaining no 
heads, precisely 1 head, at least 1 head, not more than 
4 heads. 


. Suppose that 4% of steel rods made by a machine are 


defective, the defectives occurring at random during 
production. If the rods are packaged 100 per box, what 
is the Poisson approximation of the probability that a 
given box will contain x = 0, 1,---,5 defectives? 


. Let X be the number of cars per minute passing a certain 


point of some road between 8 AM. and 10 AM. ona 
Sunday. Assume that X has a Poisson distribution with 
mean 5. Find the probability of observing 4 or fewer 
cars during any given minute. 


. Suppose that a telephone switchboard of some 


company on the average handles 300 calls per hour, 
and that the board can make at most 10 connections 
per minute. Using the Poisson distribution, estimate the 
probability that the board will be overtaxed during a 
given minute. (Use Table A6 in App. 5 or your CAS.) 
1910, E. 
Rutherford and H. Geiger showed experimentally that 
the number of alpha particles emitted per second in a 
radioactive process is a random variable X having a 
Poisson distribution. If X has mean 0.5, what is the 
probability of observing two or more particles during 
any given second? 

Let p = 2% be the probability that a certain type of 
lightbulb will fail in a 24-hour test. Find the probability 


11. 


12. 


13. 


14. 


15. 


16. 


that a sign consisting of 15 such bulbs will burn 24 
hours with no bulb failures. 


Guess how much less the probability in Prob. 10 would 
be if the sign consisted of 100 bulbs. Then calculate. 


Suppose that a certain type of magnetic tape contains, 
on the average, 2 defects per 100 meters. What is the 
probability that a roll of tape 300 meters long will 
contain (a) x defects, (b) no defects? 


Suppose that a test for extrasensory perception consists 
of naming (in any order) 3 cards randomly drawn from 
a deck of 13 cards. Find the probability that by chance 
alone, the person will correctly name (a) no cards, (b) 1 
card, (c) 2 cards, (d) 3 cards. 


If a ticket office can serve at most 4 customers per 
minute and the average number of customers is 120 per 
hour, what is the probability that during a given minute 
customers will have to wait? (Use the Poisson 
distribution, Table 6 in Appendix 5.) 


Suppose that in the production of 60-ohm radio 
resistors, nondefective items are those that have a 
resistance between 58 and 62 ohms and the probability 
of a resistor’s being defective is 0.1%. The resistors 
are sold in lots of 200, with the guarantee that all 
resistors are nondefective. What is the probability that 
a given lot will violate this guarantee? (Use the Poisson 
distribution.) 

TEAM PROJECT. Moment Generating Function. 
The moment generating function G(¢) is defined by 


Git) = Ele) = > e™f(xj) 


J 
or 


G(t) = E(e’*) = | ef (x) dx 
where X is a discrete or continuous random variable, 
respectively. 
(a) Assuming that termwise differentiation and differ- 
entiation under the integral sign are permissible, show 
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17. 


that E(X*) = G0), where G” = d*G/dt", in 
particular, u = G'(0). 

(b) Show that the binomial distribution has the 
moment generating function 


n n 
G(t) = > e= (") pg’ = = > (")cvedta 
«x=0 os x=0 me 
= (pe +g)”. 
(c) Using (b), prove (3). 
(d) Prove (4). 
(e) Show that the Poisson distribution has the moment 
generating function G(t) = e “e"* and prove (6). 


(f) Prove x (M) =u (M~ a 
x x= 1 


Using this, prove (9). 
Multinomial distribution. Suppose a trial can result 
in precisely one of k mutually exclusive events 


18. 


24.8 Normal Distribution 


Turning from discrete to continuous distributions, in this section we discuss the normal 
distribution. This is the most important continuous distribution because in applications many 
random variables are normal random variables (that is, they have a normal distribution) 
or they are approximately normal or can be transformed into normal random variables in a 
relatively simple fashion. Furthermore, the normal distribution is a useful approximation of 
more complicated distributions, and it also occurs in the proofs of various statistical tests. 
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Aj,°**,A, with probabilities p1,---, px, respectively, 
where p; +--+: + py = 1. Suppose that n independent 
trials are performed. Show that the probability of 
getting x7 Aq’s,---, x, Ax’s is 


n! 
fern) = Fo 


ey os 

where OS xjSn, j=l,---,k, and xy ts:- + 
xX; =n. The distribution having this probability 
function is called the multinomial distribution. 

A process of manufacturing screws is checked every 
hour by inspecting n screws selected at random from 
that hour’s production. If one or more screws are 
defective, the process is halted and carefully examined. 
How large should n be if the manufacturer wants the 
probability to be about 95% that the process will be 
halted when 10% of the screws being produced are 
defective? (Assume independence of the quality of any 
screw from that of the other screws.) 


The normal distribution or Gauss distribution is defined as the distribution with the 


density 
; 2 
p= 
() fx) = exp | — (==4) (7 > 0) 
277 oN G 
where exp is the exponential function with base e = 2.718---. This is simpler than it may 


at first look. f(x) has these features (see also Fig. 519). 


1. is the mean and o the standard deviation. 


2. 1/(o 277) is a constant factor that makes the area under the curve of f(x) from — 
to © equal to 1, as it must be by (10), Sec. 24.5. 


3. The curve of f(x) is symmetric with respect to x = mw because the exponent is 
quadratic. Hence for 4. = 0 it is symmetric with respect to the y-axis x = 0 (Fig. 519, 


“bell-shaped curves”). 


4. The exponential function in (1) goes to zero very fast—the faster the smaller the 
standard deviation o is, as it should be (Fig. 519). 
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f(x) 


=2 =1 i) . 2 x 


Fig. 519. Density (1) of the normal distribution with 4 = O for various values of o 


Distribution Function F(x) 


From (7) in Sec. 24.5 and (1) we see that the normal distribution has the distribution 


function 
ile ia Uo ee 
Ra a) 
oV2T J. 2 NOG: 


Here we needed x as the upper limit of integration and wrote v (instead of x) in the integrand. 
For the corresponding standardized normal distribution with mean 0 and standard 
deviation 1 we denote F(x) by ®(z). Then we simply have from (2) 


(2) F(x) = dv. 


(3) (z) = ae | eo! dy, 


—2« 


This integral cannot be integrated by one of the methods of calculus. But this is no serious 
handicap because its values can be obtained from Table A7 in App. 5 or from your CAS. 
These values are needed in working with the normal distribution. The curve of ®(z) is 
S-shaped. It increases monotone (why?) from 0 to | and intersects the vertical axis at 3 
(why?), as shown in Fig. 520. 


Relation Between F(x) and ®(z). Although your CAS will give you values of F(x) in 
(2) with any mw and o directly, it is important to comprehend that and why any such an 


F(x) can be expressed in terms of the tabulated standard ®(z), as follows. 


y 


P(x) 


1.0- 


0.8 


Fig. 520. Distribution function ®(z) of the normal distribution with mean O and variance 1 


SEC. 24.8 Normal Distribution 


THEOREM 1 


PROOF 


THEOREM 2 


PROOF 
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Use of the Normal Table A7 in App. 5 


The distribution function F(x) of the normal distribution with any w and o [see (2)] 
is related to the standardized distribution function ®(z) in (3) by the formula 


(4) F(x) = o(* = “). 


oO 


Comparing (2) and (3) we see that we should set 


Then v = x gives 


as the new upper limit of integration. Alsov — w = ou, thus dv = o du. Together, since 
o drops out, 


(x- w/o Po 
F(x) = : | oH edu= of =) | 


aoV27 


—2 


Probabilities corresponding to intervals will be needed quite frequently in statistics in 
Chap. 25. These are obtained as follows. 


Normal Probabilities for Intervals 


The probability that a normal random variable X with mean jw and standard 
deviation o assume any value in an interval a < x Sb is 


“*)-a(**) 


Formula (2) in Sec. 24.5 gives the first equality in (5), and (4) in this section gives the 
second equality. a 


(5) Pia < XS b) = F(b) — Fa) = o 


Numeric Values 


In practical work with the normal distribution it is good to remember that about 2 of all values 
of X to be observed will lie between wp + o, about 95% between ww + 20, and practically all 
between the three-sigma limits 4. + 30. More precisely, by Table A7 in App. 5, 


(a) P(e OG = X= to) = 68% 
(6) (b) Ayn, = op << CS fi se Da) = VS.S%e 


(c) IAG, = Stor SOC SS fi ce SO) = LI, 


Formulas (6a) and (6b) are illustrated in Fig. 521. 
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The formulas in (6) show that a value deviating from uz by more than a, 20, or 30 will 
occur in one of about 3, 20, and 300 trials, respectively. 


68% 95.5% 


16% 16% 


Fig. 521. Illustration of formula (6) 


In tests (Chap. 25) we shall ask, conversely, for the intervals that correspond to certain 
given probabilities; practically most important are the probabilities of 95%, 99%, and 
99.9%. For these, Table A8 in App. 5 gives the answers uw + 20, + 2.60, and 
fu + 3.30, respectively. More precisely, 


(a) P(uw — 1.960 < X Spt 1.960) = 95% 
(7) (b) P(w — 2.580 <X Spt 2.580) = 99% 
(c) P(w — 3.290 < X Sp + 3.290) = 99.9%. 


Working with the Normal Tables A7 and A8 in App. 5 


There are two normal tables in App. 5, Tables A7 and A8. If you want probabilities, use 
Table A7. If probabilities are given and corresponding intervals or x-values are wanted, 
use Table A8. The following examples are typical. Do them with care, verifying all values, 
and don’t just regard them as dull exercises for your software. Make sketches of the density 
to see whether the results look reasonable. 


EXAMPLE 1_ Reading Entries from Table A7 
If X is standardized normal (so that w = 0,0 = 1), then 


P(X S 2.44) = 0.9927 ~ 994 % 


P(X S —1.16) = 1 — ©(1.16) = 1 — 0.8770 = 0.1230 = 12.3% 


P(X 2 1) = 1—- PX S1) = 1 — 0.8413 = 0.1587) by (7), Sec. 24.3 


P(U1.0 S X S 1.8) = ®(1.8) — B(1.0) = 0.9641 — 0.8413 = 0.1228. | 


EXAMPLE 2. Probabilities for Given Intervals, Table A7 


Let X be normal with mean 0.8 and variance 4 (so that o@ = 2). Then by (4) and (5) 


2.44 — 0.80 
P(X S 2.44) = F(2.44) o( 5 ) (0.82) = 0.7939 ~ 80% 


or, if you like it better, (similarly in the other cases) 


X—0.80 _ 2.44 — 0.80 
P(X = 2.44) r( 7; = ; ) P(Z = 0.82) = 0.7939 


1-08 
P(X 21) =1-P(XSZ1)=1 0 ; ) 1 — 0.5398 = 0.4602 


P(1.0 = X = 1.8) = ®(0.5) — ®(0.1) = 0.6915 — 0.5398 = 0.1517. | 
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EXAMPLE 3. Unknown Values c for Given Probabilities, Table A8 


Let X be normal with mean 5 and variance 0.04 (hence standard deviation 0.2). Find c or k corresponding to 
the given probability 


— 5, eas 5 
P(X Sc) = 95%, a(S ) = 95%, S~ > = 1.645, ¢ = 5,329 
0.2 0.2 
PO -kS2X25+k) = 90%, 5 +k = 5.329 (as before; why?) 
c='5 
PX 2c) = 1%, thus P(X = c) = 99%, 02 = 2,326, c = 5.465. ia 


EXAMPLE 4 _ Defectives 


In a production of iron rods let the diameter X be normally distributed with mean 2 in. and standard deviation 
0.008 in. 


(a) What percentage of defectives can we expect if we set the tolerance limits at 2 + 0.02 in.? 
(b) How should we set the tolerance limits to allow for 4% defectives? 


Solution. (a) 14 % because from (5) and Table A7 we obtain for the complementary event the probability 


2.02 — 20) o( - 2°) 
0.008 0.008 


= (2.5) — B(-2.5) 
= 0.9938 — (1 — 0.9938) 


P(1.98 = X = 2.02) a 


= 0.9876 
= 983%. 
(b) 2 + 0.0164 because, for the complementary event, we have 
0.96 = P2-—cSxXS2+ 0c) 
or 
0.98 = P(X 32+ 0c) 


so that Table A8 gives 


a ond 
0.98 = o(2#<— ) 
0.008 


2 6S 2 
—___— = 2.054, c = 0.0164. ia 
0.008 


Normal Approximation of the Binomial Distribution 


The probability function of the binomial distribution is (Sec. 24.7) 


(8) fx) = (")pta-* (x = 0, 1,--,n). 


If n is large, the binomial coefficients and powers become very inconvenient. It is of great 
practical (and theoretical) importance that, in this case, the normal distribution provides 
a good approximation of the binomial distribution, according to the following theorem, 
one of the most important theorems in all probability theory. 
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For large n, 


(9) 


Limit Theorem of De Moivre and Laplace 


f@) ~ fF@) 


Here f is given by (8). The function 


(10) FH) — eR cam 
x)= e F Z= 
V 27 Vnpq Vnpq 


is the density of the normal distribution with mean fw = np and variance r= npq 
(the mean and variance of the binomial distribution). The symbol ~ (read 
asymptotically equal) means that the ratio of both sides approaches | as n approaches 
co, Furthermore, for any nonnegative integers a and b (> a), 


b 
Pa@=X=b)=)> 
xr=a 
(11) 
a — np — 0.5 
<< 
Vnpq 


(")ptat* ~ 2B - H@), 
_ b-— np +05 
Vnpq 


A proof of this theorem can be found in [G3] listed in App. 1. The proof shows that the term 
0.5 in a and B is a correction caused by the change from a discrete to a continuous distribution. 


PROBLEM—SET 24-8 


Let X be normal with mean 10 and variance 4. Find 
P(X > 12), P(X < 10), P(X < 11), PQ < X < 13). 


. Let X be normal with mean 105 and variance 25. Find 


P(X = 112.5), P(x > 100), P(110.5 < X < 111.25). 


. Let X be normal with mean 50 and variance 9. 


Determine c such that P(X < c) = 5%, P(X >c) = 
1%, P(50 —c <X < 50+ c) = 50%. 


. Let X be normal with mean 3.6 and variance 0.01. Find 


c such that P(X Sc) = 50%, P(X > c) = 10%, 
P(-c < X — 3.6 Sc) = 99.9%. 


. If the lifetime X of a certain kind of automobile battery 


is normally distributed with a mean of 5 years and a 
standard deviation of 1 year, and the manufacturer wishes 
to guarantee the battery for 4 years, what percentage of 
the batteries will he have to replace under the guarantee? 


. If the standard deviation in Prob. 5 were smaller, would 


that percentage be larger or smaller? 


. A manufacturer knows from experience that the 


resistance of resistors he produces is normal with mean 


10. 


bw = 150©Q and standard deviation o = 50. What 
percentage of the resistors will have resistance between 
148 O and 152 0? Between 140 © and 160 0? 


. The breaking strength X [kg] of a certain type of plastic 


block is normally distributed with a mean of 1500 kg 
and a standard deviation of 50 kg. What is the maximum 
load such that we can expect no more than 5% of the 
blocks to break? 


. If the mathematics scores of the SAT college entrance 


exams are normal with mean 480 and standard deviation 
100 (these are about the actual values over the past 
years) and if some college sets 500 as the minimum 
score for new students, what percent of students would 
not reach that score? 


A producer sells electric bulbs in cartons of 1000 bulbs. 
Using (11), find the probability that any given carton 
contains not more than 1% defective bulbs, assuming 
the production process to be a Bernoulli experiment 
with p = 1%(= probability that any given bulb will be 
defective). First guess. Then calculate. 
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11. 


12. 


13. 


14. 


If sick-leave time X used by employees of a company 
in one month is (very roughly) normal with mean 1000 
hours and standard deviation 100 hours, how much 
time ¢ should be budgeted for sick leave during the next 
month if ¢ is to be exceeded with probability of only 
20%? 

If the monthly machine repair and maintenance cost X 
in a certain factory is known to be normal with mean 
$12,000 and standard deviation $2000, what is the 
probability that the repair cost for the next month will 
exceed the budgeted amount of $15,000? 

If the resistance X of certain wires in an electrical 
network is normal with mean 0.01 0 and standard 
deviation 0.001 ©, how many of 1000 wires will meet 
the specification that they have resistance between 
0.009 and 0.011 0? 


TEAM PROJECT. Normal Distribution. (a) Derive 
the formulas in (6) and (7) from the appropriate normal 
table. 

(b) Show that ®(—z) = 1 — ®(z). Give an example. 
(c) Find the points of inflection of the curve of (1). 
(d) Considering @7(%) and introducing polar coordi- 
nates in the double integral (a standard trick worth 
remembering), prove 


15. 
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(12) (2) = | eo /2 dy = 1, 


V27 J_.. 


(e) Show that o in (1) is indeed the standard deviation 
of the normal distribution. [Use (12).] 

(f) Bernoulli’s law of large numbers. In an experiment 
let an event A have probability p (0 < p < 1), and let X 
be the number of times A happens in n independent trials. 
Show that for any given € > 0, 


ae Se]-1 
n P| = 


(g) Transformation. If X is normal with mean yp and 
variance o”, show that X* = cyX + co (cy > 0) is 
normal with mean p* =cyu+ cg and variance 
ao = c2o”. 

WRITING PROJECT. Use of Tables, Use of CAS. 
Give a systematic discussion of the use of Tables A7 and 
A8 for obtaining P(X < b), P(X > a), Pia < X <b), 
P(X <c) =k, P(X >c) =k, as well as Pu -c< 
X < y+ c) = k; include simple examples. If you have 
a CAS, describe to what extent it makes the use of those 
tables superfluous; give examples. 


asn7>o, 


24.9 Distributions of Several Random Variables 


Distributions of two or more random variables are of interest for two reasons: 


1. They occur in experiments in which we observe several random variables, for 
example, carbon content X and hardness Y of steel, amount of fertilizer X and yield of 
corm Y, height Xj, weight Xz, and blood pressure X3 of persons, and so on. 


2. They will be needed in the mathematical justification of the methods of statistics in 


Chap. 25. 


In this section we consider two random variables X and Y or, as we also say, a two- 
dimensional random variable (X, Y). For (X, Y) the outcome of a trial is a pair of numbers 
X = x, Y = y, briefly (X, Y) = (x, y), which we can plot as a point in the XY-plane. 

The two-dimensional probability distribution of the random variable (X, Y) is given 


by the distribution function 


(1) ACE) = AO SS oe, 7 SS 9). 


This is the probability that in a trial, X will assume any value not greater than x and in 
the same trial, Y will assume any value not greater than y. This corresponds to the blue 
region in Fig. 522, which extends to —© to the left and below. F(x, y) determines the 
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Y (x, y) 


Fig. 522. Formula (1) 


probability distribution uniquely, because in analogy to formula (2) in Sec. 24.5, that is, 
Pia < X Sb) = F(b) — F(a), we now have for a rectangle (see Prob. 16) 


(2) Play <a X= by, ag = y= bg) = F(by, by) a F(a, bg) _ F(by, da) + F(a, dg). 


As before, in the two-dimensional case we shall also have discrete and continuous 
random variables and distributions. 


Discrete Two-Dimensional Distributions 


In analogy to the case of a single random variable (Sec. 24.5), we call (X, Y) and its 
distribution discrete if (X, Y) can assume only finitely many or at most countably infinitely 
many pairs of values (x1, yz), (X92, ya), *+* with positive probabilities, whereas the probability 
for any domain containing none of those values of (X, Y) is zero. 

Let (x;, yj) be any of those pairs and let P(X = x;, Y = y;) = py (where we admit that 
pi; may be 0 for certain pairs of subscripts i, j). Then we define the probability function 
f(x, y) of (X, Y) by 


(3) fy) = py th x= x,y = yj and t(, y) = 0 otherwise; 
here, i = 1, 2,---andj = 1, 2,--- independently. In analogy to (4), Sec. 24.5, we now have 


for the distribution function the formula 


(4) F@y= > Dd fea y;). 


SX Yj7=y 
Instead of (6) in Sec. 24.5 we now have the condition 


(5) be S(xi yj) = I. 
a Jj 


Two-Dimensional Discrete Distribution 


If we simultaneously toss a dime and a nickel and consider 


X = Number of heads the dime turns up, 


Y = Number of heads the nickel turns up, 


then X and Y can have the values 0 or 1, and the probability function is 


FO, 0) = fd, 0) = f(O, 1) = fd, 1) 3, S(x,y) = 0 otherwise. ia 
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EXAMPLE 2 


Fig. 523. Notion of a two-dimensional distribution 


Continuous Two-Dimensional Distributions 


In analogy to the case of a single random variable (Sec. 24.5) we call (X, Y) and its 
distribution continuous if the corresponding distribution function F(x, y) can be given by 
a double integral 


y x 
(6) F(x, y) = | | F(x, y*) dx* dy* 


—0 ~—0 


whose integrand f, called the density of (X,Y), is nonnegative everywhere, and is 
continuous, possibly except on finitely many curves. 

From (6) we obtain the probability that (X, Y) assume any value in a rectangle (Fig. 523) 
given by the formula 


by by 
(7) Pla <X Sh, ag<YSbdo) = | | S(, y) dx dy. 


dz ay 
Two-Dimensional Uniform Distribution in a Rectangle 
Let R be the rectangle ay < x S By, ag < y S By. The density (see Fig. 524) 
(8) f(x, y) = 1/k if (x, y) is in R, (x, y) = 0 otherwise 


defines the so-called uniform distribution in the rectangle R; here k = (B, — a1)(Bz — ag) is the area of R. 
The distribution function is shown in Fig. 525. a 


Fig. 524. Density function (8) of the Fig. 525. Distribution function of the 
uniform distribution uniform distribution defined by (8) 


Marginal Distributions of a Discrete Distribution 


This is a rather natural idea, without counterpart for a single random variable. It amounts 
to being interested only in one of the two variables in (X, Y), say, X, and asking for its 
distribution, called the marginal distribution of X in (X, Y). So we ask for the probability 
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P(X = x, Y arbitrary). Since (X, Y) is discrete, so is X. We get its probability function, 
call it fj(x), from the probability function f(x, y) of (X, Y) by summing over y-: 


(9) fa) = PX = x, Y arbitrary) = 5S) fG, y) 
y 


where we sum all the values of f(x, y) that are not 0 for that x. 
From (9) we see that the distribution function of the marginal distribution of X is 


(10) Fy(x) = P(X Sx, Y arbitrary) = > Silx*). 


Ct=x 


Similarly, the probability function 


(11) fo(y) = P(X arbitrary, Y = y) = Sf y) 


x 


determines the marginal distribution of Y in (X, Y). Here we sum all the values of f(x, y) that 
are not zero for the corresponding y. The distribution function of this marginal distribution is 


(12) Fo(y) = P(X arbitrary, Y S$ y) = y fo(y*). 
y* Sy 
Marginal Distributions of a Discrete Two-Dimensional Random Variable 
In drawing 3 cards with replacement from a bridge deck let us consider 
(X, Y), X = Number of queens, Y = Number of kings or aces. 


The deck has 52 cards. These include 4 queens, 4 kings, and 4 aces. Hence in a single trial a queen has probability 
4 = i and a king or ace & = 4. This gives the probability function of (X, Y), 


rapa ANT —— wenas 
x,y Xx s 
° xlyl(3 — x — y)! \13 13 13 2 


and f(x, y) = 0 otherwise. Table 24.1 shows in the center the values of f(x, y) and on the right and lower margins 
the values of the probability functions f;(x) and fo(y) of the marginal distributions of X and Y, respectively. 


Table 24.1 Values of the Probability Functions f(x, y), f,(x), f(y) in Drawing 
Three Cards with Replacement from a Bridge Deck, where X is the Number 
of Queens Drawn and Y is the Number of Kings or Aces Drawn 


ae 0 1 2 3 f(x) 
0 1000 600 120 8 1728 
2197 2197 2197 2197 2197 
1 300 120 12 0 432 
2197 2197 2197 197 
30 6 36 
2 2197 2197 0 0 2197 
1 1 
3 2197 0 0 0 2197 
f. ( ) 1331 726 132 8 
2 x 2197 2197 2197 2197 
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Marginal Distributions of a Continuous Distribution 


This is conceptually the same as for discrete distributions, with probability functions and 
sums replaced by densities and integrals. For a continuous random variable (X, Y) with 
density f(x, y) we now have the marginal distribution of X in (X, Y), defined by the 
distribution function 


x 
(13) Fix) = PX Sx,-%™ < Y<m)= | Silx*) dx* 
with the density f; of X obtained from f(x, y) by integration over y, 
(14) Ai@) = | F(x, y) dy. 
Interchanging the roles of X and Y, we obtain the marginal distribution of Y in (X, Y) 
with the distribution function 
y 
(15) Fa) =e Ae Vey) = | fa(y*) dy* 
and density 


6) Poe | fend 


—7% 


Independence of Random Variables 


X and Y in a (discrete or continuous) random variable (X, Y) are said to be independent if 
(17) F(x, y) = FiQ)F2Q) 


holds for all (x, y). Otherwise these random variables are said to be dependent. These 
definitions are suggested by the corresponding definitions for events in Sec. 24.3. 
Necessary and sufficient for independence is 


(18) Sy) = AC@pfaY) 


for all x and y. Here the f’s are the above probability functions if (X, Y) is discrete or 
those densities if (X, Y) is continuous. (See Prob. 20.) 


Independence and Dependence 


In tossing a dime and a nickel, X = Number of heads on the dime, Y = Number of heads on the nickel may 
assume the values 0 or | and are independent. The random variables in Table 24.1 are dependent. 
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Extension of Independence to n-Dimensional Random Variables. This will be needed 
throughout Chap. 25. The distribution of such a random variable X = (Xj,---, X,) is 
determined by a distribution function of the form 


F@1,°*', Xn) = P(X Sx1,°°°, Xn S Xn). 


The random variables Xj,---, X, are said to be independent if 
(19) F(X4,°°+ Xn) = F(X 1)F2(%2) °°: Fn(Xn) 
for all (x1, +++, Xn). Here F;(x;) is the distribution function of the marginal distribution of 


X; in X, that is, 
F(xj) = P(X} = xj, X_ arbitrary, k = 1,---,n,k # j). 


Otherwise these random variables are said to be dependent. 


Functions of Random Variables 


When n = 2, we write X1 = X, Xp = Y,x1 = x, x2 = y. Taking a nonconstant continuous 
function g(x, y) defined for all x, y, we obtain a random variable Z = g(X, Y). For example, 
if we roll two dice and X and Y are the numbers the dice turn up in a trial, then Z = X + Y 
is the sum of those two numbers (see Fig. 514 in Sec. 24.5). 

In the case of a discrete random variable (X, Y) we may obtain the probability function 
f(z) of Z = g(X, Y) by summing all f(x, y) for which g(x, y) equals the value of z 
considered; thus 


(20) f2=PZ=9= D> fy). 


gx,y=z 


Hence the distribution function of Z is 


(21) F@) = PZED= D> fey) 


g(x,y) Sz 


where we sum all values of f(x, y) for which g(x, y) S z. 
In the case of a continuous random variable (X, Y) we similarly have 


(22) F(z) = P(Z=Zz) = || F(x, y) dx dy 
g(x,y) Sz 


where for each z we integrate the density f(x, y) of (X, Y) over the region g(x, y) = z in 
the xy-plane, the boundary curve of this region being g(x, y) = z. 


SEC. 24.9 Distributions of Several Random Variables 1057 


THEOREM 1 


THEROEM 2 


PROOF 


Addition of Means 


The number 


DY DY s& My) [(X, Y) discrete] 
(23)  E(g(X, Y)) = . : 
| | a(x, y) f(x, y) dx dy [(X, Y) continuous] 


is called the mathematical expectation or, briefly, the expectation of g(X, Y). Here it is 
assumed that the double series converges absolutely and the integral of |g(x, y)|f(x, y) 
over the xy-plane exists (is finite). Since summation and integration are linear processes, 
we have from (23) 

(24) E(ag(X, Y) + bh(X, Y)) = aE(g(X, Y)) + bE(H(X, Y)). 

An important special case is 


E(X + Y) = E(X) + E(Y), 


and by induction we have the following result. 


Addition of Means 


The mean (expectation) of a sum of random variables equals the sum of the means 
(expectations), that is, 


(25) E(Xy a Xo qr Peo Sp Gp) = E(X4) <P E(X2) ap ooo 4p OX) 


Furthermore, we readily obtain 


Multiplication of Means 


The mean (expectation) of the product of independent random variables equals the 
product of the means (expectations), that is, 


(26) E(X1X2°+* Xn) = E(X1)E(X2) ++: E(Xn). 


If X and Y are independent random variables (both discrete or both continuous), then 
E(XY) = E(X)E(Y). In fact, in the discrete case we have 


E(XY) = > > wf y) = S Ai@ Dd yaO) = EXE), 
x y y 


x 
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and in the continuous case the proof of the relation is similar. Extension to n independent 
random variables gives (26), and Theorem 2 is proved. a 


Addition of Variances 


This is another matter of practical importance that we shall need. As before, let Z = X + Y 
and denote the mean and variance of Z by w and o”. Then we first have (see Team Project 
20(a) in Problem Set 24.6) 


o” = E((Z — pl’) = E(Z”) - [EZ)P. 


From (24) we see that the first term on the right equals 
E(Z?) = E(X? + 2XY + Y*) = E(X?) + 2E(XY) + E(Y?). 
For the second term on the right we obtain from Theorem | 
[E(Z)? = [E(X) + EY)? = [BX)P + 2BX)EY) + EYP. 
By substituting these expressions into the formula for o” we have 


o” = E(X*) — [E(X)P? + EY?) — [E(Y)P 
+ 2[E(XY) — E(X)E(Y)]. 
From Team Project 20, Sec. 24.6, we see that the expression in the first line on the right 


is the sum of the variances of X and Y, which we denote by of and 03, respectively. The 
quantity in the second line (except for the factor 2) is 


(27) Oxy = E(XY) — E(XX)E(Y) 


and is called the covariance of X and Y. Consequently, our result is 
(28) o* = of + of + 2cxy. 
If X and Y are independent, then 
E(XY) = E(X)E(Y); 
hence oxy = 0, and 
(29) o” = 0% + 0%. 


Extension to more than two variables gives the basic 


Addition of Variances 


The variance of the sum of independent random variables equals the sum of the 
variances of these variables. 


SEC. 24.9 Distributions of Several Random Variables 


1. 


FS, y) = 625 


9. 


10. 


11. 


CAUTION! 
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In the numerous applications of Theorems | and 3 we must always 


remember that Theorem 3 holds only for independent variables. 

This is the end of Chap. 24 on probability theory. Most of the concepts, methods, and 
special distributions discussed in this chapter will play a fundamental role in the next 
chapter, which deals with methods of statistical inference, that is, conclusions from 
samples to populations, whose unknown properties we want to know and try to discover 
by looking at suitable properties of samples that we have obtained. 


PROBLEM SET 247-9 


Let f(x, y) = k when 8 Sx S$ 12 andO0 Sy $2 and 
zero elsewhere. Find k. Find P(X S 11,1 S Y $1.5) 
and P9 SX 5S 13,Y=1). 


. Find P(X > 4, ¥ > 4) and P(X = 1, YS 1) if (X,Y) 


has the density f(x, y) = 3p if x 20,y20,x+yS8. 


. Let f(x, y) = kifx > 0,y > 0,x + y < 3 and 0 other- 


wise. Find k. Sketch f(x, y). Find P(X + YS 1),PY>X). 


. Find the density of the marginal distribution of X in 


Prob. 2. 


. Find the density of the marginal distribution of Y in 


Fig. 524. 


. If certain sheets of wrapping paper have a mean weight 


of 10 g each, with a standard deviation of 0.05 g, what 
are the mean weight and standard deviation of a pack 
of 10,000 sheets? 


. What are the mean thickness and the standard deviation 


of transformer cores each consisting of 50 layers of 
sheet metal and 49 insulating paper layers if the metal 
sheets have mean thickness 0.5 mm each with a 
standard deviation of 0.05 mm and the paper layers 
have mean 0.05 mm each with a standard deviation of 
0.02 mm? 


. Let X [cm] and Y [cm] be the diameters of a pin and 


hole, respectively. Suppose that (X, Y) has the density 


if 0.98<x< 1.02, 100<y< 1.04 


and 0 otherwise. (a) Find the marginal distributions. 
(b) What is the probability that a pin chosen at random 
will fit a hole whose diameter is 1.00? 

Using Theorems | and 3, obtain the formulas for the 
mean and the variance of the binomial distribution. 


Using Theorem 1, obtain the formula for the mean of 
the hypergeometric distribution. Can you use Theorem 
3 to obtain the variance of that distribution? 


A 5-gear assembly is put together with spacers between 
the gears. The mean thickness of the gears is 5.020 cm 
with a standard deviation of 0.003 cm. The mean 
thickness of the spacers is 0.040 cm with a standard 
deviation of 0.002 cm. Find the mean and standard 
deviation of the assembled units consisting of 5 randomly 
selected gears and 4 randomly selected spacers. 


12. 


13. 


14. 


15. 


16. 
17. 


18. 


19. 


20. 


If the mean weight of certain (empty) containers is 5 lb 
the standard deviation is 0.2 Ib, and if the filling of the 
containers has mean weight 100 Ib and standard 
deviation 0.5 lb, what are the mean weight and the 
standard deviation of filled containers? 


Find P(X > Y) when (X, Y) has the density 
f(xy) = 0.25e° °°" *Y if x =0,yZ0 


and 0 otherwise. 


An electronic device consists of two components. Let 
X and Y [years] be the times to failure of the first and 
second components, respectively. Assume that (X, Y) 
has the density f(x, y) = 4e72°* if x > O and y > 0 
and 0 otherwise. (a) Are X and Y dependent or 
independent? (b) Find the densities of the marginal 
distributions. (c) What is the probability that the first 
component will have a lifetime of 2 years or longer? 
Give an example of two different discrete distributions 
that have the same marginal distributions. 

Prove (2). 

Let (X, Y) have the probability function 


FO, 0) =f, D = 
FO, 1) = fC, 0) = 


Are X and Y independent? 
Let (X, Y) have the density 


co|09 Coles 


f(y) = kifx? + y?<1 
and 0 otherwise. Determine k. Find the densities of the 
marginal distributions. Find the probability 
P(X? + 2 <b. 
Show that the random variables with the densities 


f@y=xty 
and 


g(x,y) = (x + 3) + 3) 


if OSx51,0SyS1 and f(,y)=0 and 
g(x,y) =0 elsewhere, have the same marginal 
distribution. 


Prove the statement involving (18). 
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CHAPTER 24 REVIEW QUESTIONS AND PROBLEMS 


1. 


What are stem-and-leaf plots? Boxplots? Histograms? 
Compare their advantages. 


. What properties of data are measured by the mean? The 


median? The standard deviation? The variance? 


. What do we mean by an experiment? An outcome? An 


event? Give examples. 


. What is a random variable? Its distribution function? 


Its probability function or density? 


. State the definition of probability from memory. Give 


simple examples. 


. What is sampling with and without replacement? What 


distributions are involved? 


. When is the Poisson distribution a good approximation 


of the binomial distribution? The normal distribution? 


. Explain the use of the tables of the normal distribution. 


If you have a CAS, how would you proceed without 
the tables? 


. State the main theorems on probability. Illustrate them 


by simple examples. 


. State the most important facts about distributions of 


two random variables and their marginal distributions. 


. Make a stem-and-leaf plot, histogram, and boxplot of the 


data 110, 113, 109, 118, 110, 115, 104, 111, 116, 113. 


. Same task as in Prob. 11. for the data 13.5, 13.2, 12.1, 


13.6, 13.3. 


. Find the mean, standard deviation, and variance in 


Prob. 11. 


. Find the mean, standard deviation, and variance in 


Prob. 12. 


SUMMARY—OF- CHAPTER 24 


15 


16 


17. 


18. 


19. 


20. 


21. 


22. 


23. 


24. 


25. 


. Show that the mean always lies between the smallest 
and the largest data value. 

. What are the outcomes in the sample space of the 

experiment of simultaneously tossing three coins? 

Plot a histogram of the data 8, 2, 4, 10 and guess x and s 

by inspecting the histogram. Then calculate x, s”, and s. 

Using a Venn diagram, show that A C B if and only if 

ANB=A., 

Suppose that 3% of bolts made by a machine are 

defective, the defectives occurring at random during 

production. If the bolts are packaged 50 per box, what 

is the binomial approximation of the probability that a 

given box will contain x = 0, 1,---,5 defectives? 

Of a lot of 12 items, 3 are defective. (a) Find the number 

of different samples of 3 items. Find the number of 

samples of 3 items containing (b) no defectives, (c) 1 

defective, (d) 2 defectives, (e) 3 defectives. 

Find the probability function of ¥ = Number of times 

of tossing a fair coin until the first head appears. 

If the life of ball bearings has the density f(x) = ke~” 

if 0 = x S 2 and O otherwise, what is k? What is the 

probability P(X = 1)? 

Find the mean and variance of a discrete random variable 

X having the probability function f(0) = 4, fQ) = 3, 

FQ) = 4. 

Let X be normal with mean 14 and variance 4. Determine 

c such that P(X Sc) =95%, P(X Sc) =5%, 

P(X Sc) = 99.5%. 

Let X be normal with mean 80 and variance 9. Find 

P(X > 83), P(X < 81), P(X < 80), and P(78 < X < 82). 


Data Analysis. Probability Theory 


() 


A random experiment, briefly called experiment, is a process in which the result 
(“outcome’’) depends on “chance” (effects of factors unknown to us). Examples are 
games of chance with dice or cards, measuring the hardness of steel, observing weather 
conditions, or recording the number of accidents in a city. (Thus the word “experiment” 
is used here in a much wider sense than in common language.) The outcomes are 
regarded as points (elements) of a set S, called the sample space, whose subsets are 
called events. For events E we define a probability P(E) by the axioms (Sec. 24.3) 


OS P(E)=1 


P(E, U Ey U «++) = P(E) + P(Ex) + + 


P(S) = 1 
(Ej N Ex = @). 


Summary of Chapter 24 1061 


These axioms are motivated by properties of frequency distributions of data 
(Sec. 24.1). 
The complement E° of E has the probability 


(2) P(ES) = 1 — P(E). 


The conditional probability of an event B under the condition that an event A 
happens is (Sec. 24.3) 


P(A 1B) 
(3) P(BIA) = PA) [P(A) > 0]. 


Two events A and B are called independent if the probability of their simultaneous 
appearance in a trial equals the product of their probabilities, that is, if 


(4) P(A 1 B) = P(A)P(B). 


With an experiment we associate a random variable X. This is a function defined 
on S whose values are real numbers; furthermore, X is such that the probability 
P(X = a) with which X assumes any value a, and the probability Pia < X S b) with 
which X assumes any value in an interval a < X S b are defined (Sec. 24.5). The 
probability distribution of X is determined by the distribution function 


(5) F(x) = P(X =x). 


In applications there are two important kinds of random variables: those of the 
discrete type, which appear if we count (defective items, customers in a bank, etc.) 
and those of the continuous type, which appear if we measure (length, speed, 
temperature, weight, etc.). 

A discrete random variable has a probability function 


(6) f(x) = P(X = x). 


Its mean pw and variance o” are (Sec. 24.6) 
(7) w= Di xifxs) and a = Sx — wy Fs) 
g J 


where the x; are the values for which X has a positive probability. Important discrete 
random variables and distributions are the binomial, Poisson, and hypergeometric 
distributions discussed in Sec. 24.7. 

A continuous random variable has a density 


(8) f@) = F’@) [see (5)]. 


Its mean and variance are (Sec. 24.6) 


(9) b= | xf(x) dx and c= | (x — p)*f(x) dx. 


—2 —7 
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Very important is the normal distribution (Sec. 24.8), whose density is 


2 
= 1 1 2 
(10) i aie os| ‘( ; 


and whose distribution function is (Sec. 24.8; Tables A7, A8 in App. 5) 
x= jb 
(11) F@) = o(), 


A two-dimensional random variable (X, Y) occurs if we simultaneously observe 
two quantities (for example, height X and weight Y of adults). Its distribution function 
is (Sec. 24.9) 

(12) F(x, y) = PX Sx, Y Sy). 
X and Y have the distribution functions (Sec. 24.9) 


(13) Fy(x) = P(X S x, Y arbitrary) and Fo(y) = P(x arbitrary, Y S y) 


respectively; their distributions are called marginal distributions. If both X and Y 
are discrete, then (X, Y) has a probability function 


f(x, y) = PX = x, Y= y). 


If both X and Y are continuous, then (X, Y) has a density f(x, y). 


CHAPTER 2 5 


d, 


——a a 


Mathematical Statistics 


In probability theory we set up mathematical models of processes that are affected by 
“chance.” In mathematical statistics or, briefly, statistics, we check these models against 
the observable reality. This is called statistical inference. It is done by sampling, that 
is, by drawing random samples, briefly called samples. These are sets of values from a 
much larger set of values that could be studied, called the population. An example is 
10 diameters of screws drawn from a large lot of screws. Sampling is done in order to 
see whether a model of the population is accurate enough for practical purposes. If this 
is the case, the model can be used for predictions, decisions, and actions, for instance, in 
planning productions, buying equipment, investing in business projects, and so on. 

Most important methods of statistical inference are estimation of parameters (Secs. 25.2), 
determination of confidence intervals (Sec. 25.3), and hypothesis testing (Sec. 25.4, 25.7, 
25.8), with application to quality control (Sec. 25.5) and acceptance sampling (Sec. 25.6). 

In the last section (25.9) we give an introduction to regression and correlation analysis, 
which concern experiments involving two variables. 


Prerequisite: Chap. 24. 
Sections that may be omitted in a shorter course: 25.5, 25.6, 25.8. 
References, Answers to Problems, and Statistical Tables: App. 1 Part G, App. 2, App. 5. 


25.1 Introduction. Random Sampling 


Mathematical statistics consists of methods for designing and evaluating random 
experiments to obtain information about practical problems, such as exploring the relation 
between iron content and density of iron ore, the quality of raw material or manufactured 
products, the efficiency of air-conditioning systems, the performance of certain cars, the 
effect of advertising, the reactions of consumers to a new product, etc. 

Random variables occur more frequently in engineering (and elsewhere) than one 
would think. For example, properties of mass-produced articles (screws, lightbulbs, etc.) 
always show random variation, due to small (uncontrollable!) differences in raw material 
or manufacturing processes. Thus the diameter of screws is a random variable X and we 
have nondefective screws, with diameter between given tolerance limits, and defective 
screws, with diameter outside those limits. We can ask for the distribution of X, for the 
percentage of defective screws to be expected, and for necessary improvements of the 
production process. 

Samples are selected from populations—20 screws from a lot of 1000, 100 of 5000 
voters, 8 beavers in a wildlife conservation project—because inspecting the entire 
population would be too expensive, time-consuming, impossible or even senseless (think 
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of destructive testing of lightbulbs or dynamite). To obtain meaningful conclusions, 
samples must be random selections. Each of the 1000 screws must have the same chance 
of being sampled (of being drawn when we sample), at least approximately. Only then 
will the sample mean X = (x1 + +++ + x29)/20 (Sec. 24.1) of a sample of size n = 20 
(or any other 2) be a good approximation of the population mean p (Sec. 24.6); and the 
accuracy of the approximation will generally improve with increasing n, as we shall see. 
Similarly for other parameters (standard deviation, variance, etc.). 

Independent sample values will be obtained in experiments with an infinite sample 
space S (Sec. 24.2), certainly for the normal distribution. This is also true in sampling with 
replacement. It is approximately true in drawing small samples from a large finite population 
(for instance, 5 or 10 of 1000 items). However, if we sample without replacement from a 
small population, the effect of dependence of sample values may be considerable. 

Random numbers help in obtaining samples that are in fact random selections. This 
is sometimes not easy to accomplish because there are many subtle factors that can bias 
sampling (by personal interviews, by poorly working machines, by the choice of 
nontypical observation conditions, etc.). Random numbers can be obtained from a 
random number generator in Maple, Mathematica, or other systems listed on p. 789. 
(The numbers are not truly random, as they would be produced in flipping coins or 
rolling dice, but are calculated by a tricky formula that produces numbers that do have 
practically all the essential features of true randomness. Because these numbers 
eventually repeat, they must not be used in cryptography, for example, where true 
randomness is required.) 


Random Numbers from a Random Number Generator 


To select a sample of size n = 10 from 80 given ball bearings, we number the bearings from | to 80. We then 
let the generator randomly produce 10 of the integers from 1 to 80 and include the bearings with the numbers 
obtained in our sample, for example. 


44 55 53 03 52 61 67 78 39 54 


or whatever. 
Random numbers are also contained in (older) statistical tables. B 


Representing and processing data were considered in Sec. 24.1 in connection with 
frequency distributions. These are the empirical counterparts of probability distributions 
and helped motivating axioms and properties in probability theory. The new aspect in this 
chapter is randomness: the data are samples selected randomly from a population. 
Accordingly, we can immediately make the connection to Sec. 24.1, using stem-and-leaf 
plots, box plots, and histograms for representing samples graphically. 

Also, we now call the mean x in (5), Sec. 24.1, the sample mean 


- i= 1 
(1) hy Ot a a, 


Pol 


We call n the sample size, the variance s* in (6), Sec. 24.1, the sample variance 


= 1 
(2) = D 65 — 3)? = — [a — 3 + + On — 1)" 
jel 
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and its positive square root s the sample standard deviation. x, s”, and s are called 
parameters of a sample; they will be needed throughout this chapter. 


25.2 Point Estimation of Parameters 


Beginning in this section, we shall discuss the most basic practical tasks in statistics and 
corresponding statistical methods to accomplish them. The first of them is point estimation 
of parameters, that is, of quantities appearing in distributions, such as p in the binomial 
distribution and yw and o in the normal distribution. 

A point estimate of a parameter is a number (point on the real line), which is computed 
from a given sample and serves as an approximation of the unknown exact value of the 
parameter of the population. An interval estimate is an interval (“confidence interval”) 
obtained from a sample; such estimates will be considered in the next section. Estimation 
of parameters is of great practical importance in many applications. 

As an approximation of the mean yw of a population we may take the mean x of a 
corresponding sample. This gives the estimate & = x for p, that is, 


) == Fe to ta) 


where n is the sample size. Similarly, an estimate 6 for the variance of a population is 
the variance s” of a corresponding sample, that is, 


(2) G2 = 2 = S (xj — 2. 


Clearly, (1) and (2) are estimates of parameters for distributions in which pw or o 


appear explicity as parameters, such as the normal and Poisson distributions. For the 
binomial distribution, p = y/n [see (3) in Sec. 24.7]. From (1) we thus obtain for p 
the estimate 


(3) 


» 
II 
Sle 


We mention that (1) is a special case of the so-called method of moments. In this 
method the parameters to be estimated are expressed in terms of the moments of the 
distribution (see Sec. 24.6). In the resulting formulas, those moments of the distribution 
are replaced by the corresponding moments of the sample. This gives the estimates. Here 
the kth moment of a sample xj, ---, x, is 


1066 


EXAMPLE 1 


CHAP. 25 Mathematical Statistics 


Maximum Likelihood Method 


Another method for obtaining estimates is the so-called maximum likelihood method of 
R. A. Fisher [Messenger Math. 41 (1912), 155-160]. To explain it, we consider a discrete 
(or continuous) random variable X whose probability function (or density) f(x) depends 
on a single parameter 6. We take a corresponding sample of n independent values 


X1,°'',Xn. Then in the discrete case the probability that a sample of size n consists 


precisely of those n values is 
(4) L= fr) f(%2) +++ fn). 


In the continuous case the probability that the sample consists of values in the small 
intervals x; S x Sx; + Ax(j = 1,2,---,n) is 


(5) flrAx fir) Ax ++ frp)Ax = (Ax)”. 


Since f(x;) depends on 6, the function / in (5) given by (4) depends on x1,---, x, and 0. 
We imagine x1,°-*, xX, to be given and fixed. Then / is a function of 6, which is called 
the likelihood function. The basic idea of the maximum likelihood method is quite simple, 
as follows. We choose that approximation for the unknown value of 6 for which / is as 
large as possible. If / is a differentiable function of 0, a necessary condition for / to have 
a maximum in an interval (not at the boundary) is 


ol 
6 —= 0. 
(6) 30 
(We write a partial derivative, because / depends also on x1,---,x,.) A solution of (6) 
depending on x1,---, x, is called a maximum likelihood estimate for 6. We may replace 
(6) by 

dlnl 

7 = 0, 
(7) 36 


because f(x;) > 0, a maximum of / is in general positive, and In / is a monotone increasing 
function of /. This often simplifies calculations. 


Several Parameters. If the distribution of X involves r parameters 01, --- , 6,, then instead 
of (6) we have the r conditions 0//060; = 0,---, 01/06, = 0, and instead of (7) we have 


dlnl dlnl 
8 = 0, ity = 0. 
(8) 561 


Normal Distribution 
Find maximum likelihood estimates for 0; = and 02 = @ in the case of the normal distribution. 


Solution. From (1), Sec. 24.8, and (4) we obtain the likelihood function 


1 nm 1 n 1 n 
- (Fa) (;) ee Khe a 2 Ue 


j=l 
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Taking logarithms, we have 
In/ = —naln V27 — ning — h. 


The first equation in (8) is dUn J)/du = 0, written out 


dlnl oh 
Om Om 


hence xj; — np = 0. 


Ms 


q 
& 
Il 
A 
& 
I 
ul 


The solution is the desired estimate f for ww: we find 
1 n 
if = n > Xj =x. 
j=l 
The second equation in (8) is d(In J)/do = 0, written out 


dInl n oh 1 
lon o 


1 
+ yey =p =o 


Replacing pw by ~ and solving for o”, we obtain the estimate 


12 
.=— S (aj — x)? 
j=l 


which we shall use in Sec. 25.7. Note that this differs from (2). We cannot discuss criteria for the goodness of 
estimates but want to mention that for small n, formula (2) is preferable. ia 


PROBLEEM—SET 25-2 


1. 


Normal distribution. Apply the maximum likelihood 
method to the normal distribution with ~ = 0. 


. Find the maximum likelihood estimate for the 


parameter y of a normal distribution with known 
variance 0” = oe = 16. 


. Poisson distribution. Derive the maximum likelihood 


estimator for jz. Apply it to the sample (10, 25, 26, 17, 
10, 4), giving numbers of minutes with 0-10, 11-20, 
21-30, 31-40, 41-50, more than 50 fliers per minute, 
respectively, checking in at some airport check-in. 


. Uniform distribution. Show that, in the case of the 


parameters a and b of the uniform distribution (see 
Sec. 24.6), the maximum likelihood estimate cannot be 
obtained by equating the first derivative to zero. How 
can we obtain maximum likelihood estimates in this 
case, more or less by using common sense? 


. Binomial distribution. Derive a maximum likelihood 


estimate for p. 


. Extend Prob. 5 as follows. Suppose that m times n trials 


were made and in the first n trials A happened k, times, 
in the second n trials A happened kg times, ---, in the 
mth n trials A happened k,, times. Find a maximum 
likelihood estimate of p based on this information. 


7. 


10. 


11. 


12. 


Suppose that in Prob. 6 we made 3 times 4 trials and 
A happened 2, 3, 2 times, respectively. Estimate p. 


. Geometric distribution. Let X = Number of inde- 


pendent trials until an event A occurs. Show that X has 
a geometric distribution, defined by the probability 
function f(x) = pqr yx = 1,2,---, where p is the 
probability of A in a single trial and g = 1 — p. Find 
the maximum likelihood estimate of p corresponding to 
a sample x1, X9,°**, 2X, of observed values of X. 


. In Prob. 8, show that f(1) + f(2) +--: = 1 (as it 


should be!). Calculate independently of Prob. 8 the 
maximum likelihood of p in Prob. 8 corresponding to 
a single observed value of X. 


In rolling a die, suppose that we get the first “Six” in 
the 7th trial and in doing it again we get it in the 6th 
trial. Estimate the probability p of getting a “Six” in 
rolling that die once. 

Find the maximum likelihood estimate of @ in the 
density f(x) = 6e~° if x = 0 and f(x) = O if x < 0. 
In Prob. 11, find the mean yp, substitute it in f(x), find 
the maximum likelihood estimate of , and show that 
it is identical with the estimate for w which can be 
obtained from that for 6 in Prob. 11. 
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13. Compute 6 in Prob. 11 from the sample 1.9, 0.4, 0.7, 0.6, 15. CAS EXPERIMENT. Maximum Likelihood 


1.4. Graph the sample distribution function F(x) and the Estimates. (MLEs). Find experimentally how much 
distribution function F(x) of the random variable, with MLEs can differ depending on the sample size. Hint. 
0 = 6, on the same axes. Do they agree reasonably well? Generate many samples of the same size n, e.g., of the 
(We consider goodness of fit systematically in Sec. 25.7.) standardized normal distribution, and record X and s?. 


14. Do the same task as in Prob. 13 if the given sample is 


Then increase n. 


0.4, 0.7, 0.2, 1.1, 0.1. 


25.3 Confidence Intervals 


Confidence intervals’ for an unknown parameter 6 of some distribution (e.g., 9 = j2) are 
intervals 0; = 0 S Og that contain 6, not with certainty but with a high probability y, 
which we can choose (95% and 99% are popular). Such an interval is calculated from a 
sample. y = 95% means probability | — y = 5% = = of being wrong—one of about 
20 such intervals will not contain 0. Instead of writing 0; = 6 S 02, we denote this more 
distinctly by writing 


(1) CONF, {01 = 0 S 69}. 


Such a special symbol, CONF, seems worthwhile in order to avoid the misunderstanding 
that 0 must lie between 6, and 69. 

y is called the confidence level, and 0; and 62 are called the lower and upper 
confidence limits. They depend on y. The larger we choose y, the smaller is the error 
probability 1 — y, but the longer is the confidence interval. If y — 1, then its length goes 
to infinity. The choice of y depends on the kind of application. In taking no umbrella, a 
5% chance of getting wet is not tragic. In a medical decision of life or death, a 5% chance 
of being wrong may be too large and a 1% chance of being wrong (y = 99%) may be 
more desirable. 

Confidence intervals are more valuable than point estimates (Sec. 25.2). Indeed, we can 
take the midpoint of (1) as an approximation of 6 and half the length of (1) as an “error bound” 
(not in the strict sense of numerics, but except for an error whose probability we know). 

61 and 6 in (1) are calculated from a sample x1, ---,x,. These are n observations of a 
random variable X. Now comes a standard trick. We regard x1,°-:,X, as single 
observations of n random variables X1,:+++, Xn (with the same distribution, namely, that 
of X). Then 01 = 04(%1,°++, X,) and @2 = O2(x1,°°-, X,) in (1) are observed values of two 
random variables 0, = 04(Xj,:°:,X,) and Og = Oo(Xj,°::,X,). The condition (1) 
involving y can now be written 


(2) P(Q; £9 = Oy) = ¥. 


Let us see what all this means in concrete practical cases. 

In each case in this section we shall first state the steps of obtaining a confidence interval 
in the form of a table, then consider a typical example, and finally justify those steps 
theoretically. 


1JERZY NEYMAN (1894-1981), American statistician, developed the theory of confidence intervals (Annals 
of Mathematical Statistics 6 (1935), 111-116). 
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Confidence Interval for 4 of the Normal Distribution 
with Known o7 


Table 25.1 Determination of a Confidence Interval for the Mean 
of a Normal Distribution with Known Variance o7 


Step I. Choose a confidence level y (95%, 99%, or the like). 


Step 2. Determine the corresponding c: 


Y | 0.90 0.95 0.99 0.999 


c | 1.645 1.960 2.576 3.291 


Step 3. Compute the mean x of the sample x1,---, Xp. 


Step 4. Compute k = ca/ Vn. The confidence interval for pu is 


(3) CONF, {x ~kKSpSxt k}. 


EXAMPLE 1. Confidence Interval for 1 of the Normal Distribution with Known o” 
Determine a 95% confidence interval for the mean of a normal distribution with variance 0? = 9, using a sample 
of n = 100 values with mean x = 5. 


Solution. Step 1. y = 0.95 is required. Step 2. The corresponding c equals 1.960; see Table 25.1. 
Step 3.X = 5isgiven. Step 4. We need k = 1.960 - 3/100 = 0.588. Hence ¥ — k = 4.412,% + k = 5.588 
and the confidence interval is CONF9 95 {4.412 S p S 5.588}. 
This is sometimes written 4 = 5 + 0.588, but we shall not use this notation, which can be misleading. 
With your CAS you can determine this interval more directly. Similarly for the other examples in this section. 


Theory for Table 25.1. The method in Table 25.1 follows from the basic 


THEOREM 1 Sum of Independent Normal Random Variables 


Let Xy,:++, Xy be independent normal random variables each of which has mean 
band variance o”. Then the following holds. 


(a) The sum X, +--+ + X, is normal with mean np and variance no. 


(b) The following random variable X is normal with mean mw and variance o”/ n. 
= 1 
(4) Cn x 


(c) The following random variable Z is normal with mean 0 and variance |. 


— o/Vn 


(5) Z 
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The statements about the mean and variance in (a) follow from Theorems | and 3 in 
Sec. 24.9. From this, and Theorem 2 in Sec. 24.6, we see that X has the mean (1/n)np = » 
and the variance (1/ n)’no” = o*/ n. This implies that Z has the mean 0 and variance 1, 
by Theorem 2(b) in Sec. 24.6. The normality of X; + --- + X, is proved in Ref. [G3] 
listed in App. 1. This implies the normality of (4) and (5). oH 


Derivation of (3) in Table 25.1. Sampling from a normal distribution gives independent 
sample values (see Sec. 25.1), so that Theorem | applies. Hence we can choose y and 
then determine c such that 


6 P faa = P( -—* 20)=0 ri) = 
(6) (-e SZ 0) o> (c) (—e) = ¥. 


For the value y = 0.95 we obtain z(D) = 1.960 from Table A8 in App. 5, as used in 
Example 1. For y = 0.9, 0.99, 0.999 we get the other values of c listed in Table 25.1. 
Finally, all we have to do is to convert the inequality in (6) into one for mw and insert 
observed values obtained from the sample. We multiply —c S Z = c by —1 and then by 
o/Vn, writing ca/Vn = k (as in Table 25.1), 


= ain Cc 


PecSZe¢c) = Pi(e=-Ze ¢ = P(e 


Adding X gives (PX +k2u~2X-h=y or 
(7) PX -—kSpsxX+h=y. 


Inserting the observed value X of X gives (3). Here we have regarded x1, +++, Xn as single 
observations of Xj, ---, Xp (the standard trick!), so that x1 + +++ + x, is an observed value 
of X; + -:: + X,, and X is an observed value of X. Note further that (7) is of the form (2) 


with 0, = X —kand @. =X +k. a 


Sample Size Needed for a Confidence Interval of Prescribed Length 
How large must n be in Example | if we want to obtain a 95% confidence interval of length L = 0.4? 


Solution. The interval (3) has the length L = 2k = 2co/Vn. Solving for n, we obtain 
n = (2co/L)*. 


In the present case the answer is n = (2+ 1.960 - 3/0.4)" = 870. 
Figure 526 shows how L decreases as n increases and that for y = 99% the confidence interval is substantially 
longer than for y = 95% (and the same sample size n). 
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0.6 


0.4 


Lio 


0.2 


95 500 


n 


Fig. 526. Length of the confidence interval (3) (measured in multiples of o) 
as a function of the sample size n for y = 95% and y = 99% 


Confidence Interval for of the Normal Distribution 
with Unknown o7 


In practice o” is frequently unknown. Then the method in Table 25.1 does not help and 
the whole theory changes, although the steps of determining a confidence interval for w 
remain quite similar. They are shown in Table 25.2. We see that k differs from that in 
Table 25.1, namely, the sample standard deviation s has taken the place of the unknown 
standard deviation o of the population. And c now depends on the sample size n and must 
be determined from Table A9 in App. 5 or from your CAS. That table lists values z for 
given values of the distribution function (Fig. 527) 


x 2\ -—(m+1)/2 
(8) F(2) = Kn | (1 4: <) du 


m 
of the ¢-distribution. Here, m (= 1, 2,---) is a parameter, called the number of degrees 
of freedom of the distribution (abbreviated d.f.). In the present case, m = n — 1; see 
Table 25.2. The constant K,, is such that F(%) = 1. By integration it turns out that 
Km = TG m+ 3)/[ Vnitr T(m)], where I is the gamma function (see (24) in App. A3.1). 


Table 25.2 Determination of a Confidence Interval for the Mean pu 
of a Normal Distribution with Unknown Variance a” 


Step 1. Choose a confidence level y (95%, 99%, or the like). 


Step 2. Determine the solution c of the equation 


(9) F(e) =3(1 + y) 


from the table of the ft-distribution with n — 1 degrees of freedom 
(Table A9 in App. 5; or use a CAS; n = sample size). 
Step 3. Compute the mean X and the variance s” of the sample x, °° +, Xp. 


Step 4. Compute k = cs/Vn. The confidence interval is 


(10) CONF, {¥ —kSpSi+hk). 
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Figure 528 compares the curve of the density of the f-distribution with that of the normal 
distribution. The latter is steeper. This illustrates that Table 25.1 (which uses more 
information, namely, the known value of o”) yields shorter confidence intervals than 
Table 25.2. This is confirmed in Fig. 529, which also gives an idea of the gain by increasing 
the sample size. 


Fig. 527. Distribution functions of the Fig. 528. Densities of the t-distribution 
t-distribution with 1 and 3 df. and of the with 1 and 3 d.f. and of the standardized 
standardized normal distribution (steepest curve) normal distribution 
2 


LIL 1.5- 


0 10 » 20 


Fig. 529. Ratio of the lengths L’ and L of the confidence 
intervals (10) and (3) with y = 95% and y = 99% as a function 
of the sample size n for equal s and o 


Confidence Interval for 4 of the Normal Distribution with Unknown o” 


Five independent measurements of the point of inflammation (flash point) of Diesel oil (D-2) gave the values 
(in°F) 144 147 146 142 144. Assuming normality, determine a 99% confidence interval for the mean. 


Solution. Step 1. y = 0.99 is required. 

Step 2. F(c) = 3(1 + y) = 0.995, and Table A9 in App. 5 with n — 1 = 4 d-f. gives c = 4.60. 
Step 3. ¥ = 144.6, s? = 3.8. 

Step 4. k = V3.8 + 4.60/V5 = 4.01. The confidence interval is CONFo.99 {140.5 S w S 148.7}. 


If the variance 07 were known and equal to the sample variance 2, thus o? = 3.8, then Table 25.1 would 
give k = co/Vn = 2.576V3.8/V5 = 2.25 and CONF 99 {142.35 = w S 146.85}. We see that the present 
interval is almost twice as long as that obtained from Table 25.1 (with 7? = 3.8). Hence for small samples the 
difference is considerable! See also Fig. 529. a 
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THEOREM 2 


Theory for Table 25.2. For deriving (10) in Table 25.2 we need from Ref. [G3] 


Student’s t-Distribution 


Let X1,-++, Xp, be independent normal random variables with the same mean jt and 
the same variance a”. Then the random variable 


a 


11 T= 
(11) TE 


has a t-distribution [see (8)] with n — | degrees of freedom (d.f.); here X is given 
by (4) and 


(12) Sa = 


Derivation of (10). This is similar to the derivation of (3). We choose a number y 
between 0 and | and determine a number c from Table A9 in App. 5 with n — | d.f. (or 
from a CAS) such that 


(13) P(-c ST Sc) = F(c) — F(-o) = y. 


Since the f-distribution is symmetric, we have 
F(—c) = 1 — Fo), 


and (13) assumes the form (9). Substituting (11) into (13) and transforming the result as 
before, we obtain 


(14) PX -KSySX+K=y 
where 
K = cS/Vn. 


By inserting the observed values x of X and s” of S? into (14) we finally obtain (10). & 


Confidence Interval for the Variance a7 


of the Normal Distribution 
Table 25.3 shows the steps, which are similar to those in Tables 25.1 and 25.2. 
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Table 25.3 Determination of a Confidence Interval for the Variance 
o? of a Normal Distribution, Whose Mean Need Not Be Known 


Step 1. Choose a confidence level y (95%, 99%, or the like). 


Step 2. Determine solutions cy and cz of the equations 


(15) F(e1) = 3(1 — 9), F(a) =3(1 + y) 


from the table of the chi-square distribution with n — | degrees of 
freedom (Table A10 in App. 5; or use a CAS; n = sample size). 


Step 3. Compute (n — 1)s”, where s? is the variance of the sample 
X47 Xn 

Step 4. Compute ky = (n — 1)s?/cy and ky = (n — 1)s”/cp, The 
confidence interval is 


(16) CONF, {k = 0” 


IIA 


ky}. 


Confidence Interval for the Variance of the Normal Distribution 


Determine a 95% confidence interval (16) for the variance, using Table 25.3 and a sample (tensile strength of 
sheet steel in kg/ mm?, rounded to integer values) 


89 84 87 81 89 86 91 90 78 89 87 99 83 89. 


Solution. Step 1. y = 0.95 is required. 
Step 2. Forn — 1 = 13 we find 
cy = 5.01 and cp = 24.74. 
Step 3. 13s” = 326.9. 
Step 4. 13s7/c, = 65.25, 138?/cg = 13.21. 
The confidence interval is 
CONF 95 {13.21 S 07 S 65.25}. 


This is rather large, and for obtaining a more precise result, one would need a much larger sample. BH 


Theory for Table 25.3. In Table 25.1 we used the normal distribution, in Table 25.2 
the ¢-distribution, and now we shall use the X-distribution (chi-square distribution), 
whose distribution function is F(z) = 0 if z < 0 and 


= 
F(2) = Cn | ey? ay if 20 (Fig. 530). 
0 


The parameter m (= 1, 2,---) is called the number of degrees of freedom (d_-f.), and 
Cm = 1/12"? Em). 


Note that the distribution is not symmetric (see also Fig. 531). 
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For deriving (16) in Table 25.3 we need the following theorem. 


0 2 4 6 8 10 x 
Fig. 530. Distribution function of the chi-square distribution with 2, 3, 5 df. 


THEOREM 3 Chi-Square Distribution 


Under the assumptions in Theorem 2 the random variable 


2 
(17) Y=(n- yp 
Oo 


with S” given by (12) has a chi-square distribution with n — | degrees of freedom. 


Proof in Ref. [G3], listed in App. 1. 
y 


0.5 
0.4 
0.3 
0.2 


0.1 


fj 


0) 2 4 6 8 10 x 


Fig. 531. Density of the chi-square distribution with 2, 3, 5 df. 


Derivation of (16). This is similar to the derivation of (3) and (10). We choose a number 
y between 0 and | and determine c; and cz from Table A10, App. 5, such that [see (15)] 


PYSa)=Fa)=30-y), PY Sco) = Fc) =9(1 + 7). 
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Subtraction yields 


P(cy S Y Se) = PY Sc) — PY S c1) = Fl(c2) F(c1) = y. 


Transforming cy = Y S cg with Y given by (17) into an inequality for o”, we obtain 


1 


Sasers eet: 
c2 Cy 


n—- 1 


s* 
By inserting the observed value s” of S? we obtain (16). 3 


Confidence Intervals for Parameters 
of Other Distributions 


The methods in Tables 25.1—25.3 for confidence intervals for 4 and o” are designed for 
the normal distribution. We now show that they can also be applied to other distributions 
if we use large samples. 

We know that if X1,---, X, are independent random variables with the same mean ju 
and the same variance o”, then their sum Y,, = X; + --- + X,, has the following properties. 


(A) ¥, has the mean ny and the variance no (by Theorems | and 3 in Sec. 24.9). 
(B) If those variables are normal, then ¥, is normal (by Theorem 1). 


If those random variables are not normal, then (B) is not applicable. However, for large 
n the random variable Y,, is still approximately normal. This follows from the central limit 
theorem, which is one of the most fundamental results in probability theory. 


Central Limit Theorem 


Let Xy,°-++, Xn,°++ be independent random variables that have the same distribution 
function and therefore the same mean tt and the same variance ao”. Let 
Y, = X1 + +++ + Xp. Then the random variable 


iy =u 


18 LZy = 
(18) PU. 


is asymptotically normal with mean 0 and variance 1; that is, the distribution 
function F(x) of Z,, satisfies 


lim F,(x) = ®(x) = = | eo dy. 
n—2 a/ 7 


0 


A proof can be found in Ref. [G3] listed in App. 1. 


Hence, when applying Tables 25.1—25.3 to a nonnormal distribution, we must use 
sufficiently large samples. As a rule of thumb, if the sample indicates that the skewness 
of the distribution (the asymmetry; see Team Project 20(d), Problem Set 24.6) is small, 
use at least n = 20 for the mean and at least n = 50 for the variance. 
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1. 


Why are interval estimates generally more useful than 
point estimates? 


2-6 


MEAN (VARIANCE KNOWN) 


2. 


Find a 95% confidence interval for the mean of a 
normal population with standard deviation 4.00 from 
the sample 39, 51, 49, 43, 57, 59. Does that interval 
get longer or shorter if we take y = 0.99 instead of 
0.95? By what factor? 


. By what factor does the length of the interval in Prob. 2 


change if we double the sample size? 


. Determine a 95% confidence interval for the mean pw 


of a normal population with variance 0? = 16, using 
a sample of size 200 with mean 74.81. 


. What sample size would be needed for obtaining a 95 % 


confidence interval (3) of length 20? Of length a? 


. What sample size is needed to obtain a 99% confidence 


interval of length 2.0 for the mean of a normal population 
with variance 25? Use Fig. 526. Check by calculation. 


MEAN (VARIANCE UNKNOWN) 


7. 


Find a 95% confidence interval for the percentage of 
cars on a certain highway that have poorly adjusted 
brakes, using a random sample of 800 cars stopped at 
a roadblock on that highway, 126 of which had poorly 
adjusted brakes. 


. K. Pearson result. Find a 99% confidence interval for 


p in the binomial distribution from a classical result by 
K. Pearson, who in 24,000 trials of tossing a coin obtained 
12,012 Heads. Do you think that the coin was fair? 


9-11 


Find a 99% confidence interval for the mean of 


a normal population from the sample: 


9. 


10. 
11. 


25.4 Testing of Hypotheses. 


Copper content (%) of brass 66, 66, 65, 64, 66, 67, 64, 
65, 63, 64 
Melting point (°C) of aluminum 660, 667, 654, 663, 662 


Knoop hardness of diamond 9500, 9800, 9750, 9200, 
9400, 9550 


12. 


CAS EXPERIMENT. Confidence Intervals. Obtain 
100 samples of size 10 of the standardized normal 
distribution. Calculate from them and graph the 
corresponding 95% confidence intervals for the mean 
and count how many of them do not contain 0. Does 
the result support the theory? Repeat the whole 
experiment, compare and comment. 


13-17 


VARIANCE 


Find a 95% confidence interval for the variance of a normal 
population from the sample: 


13. 


14. 


15. 


16. 


17. 
18. 


19. 


20. 


Length of 20 bolts with sample mean 20.2 cm and 
sample variance 0.04 cm” 


Carbon monoxide emission (grams per mile) of a 
certain type of passenger car (cruising at 55 mph): 17.3, 
17.8, 18.0, 17.7, 18.2, 17.4, 17.6, 18.1 


Mean energy (keV) of delayed neutron group (Group 3, 
half-life 6.2 s) for uranium U?*? fission: a sample of 
100 values with mean 442.5 and variance 9.3 


Ultimate tensile strength (k psi) of alloy steel 
(Maraging H) at room temperature: 251, 255, 258, 253, 
253, 252, 250, 252, 255, 256 


The sample in Prob. 9 


If X; and X5 are independent normal random variables 
with mean 14 and 8 and variance 2 and 5, respectively, 
what distribution does 3 X; — Xo have? Hint. Use Team 
Project 14(g) in Sec. 24.8. 


A machine fills boxes weighing Y Ib with X lb of salt, 
where X and Y are normal with mean 100 Ib and 5 |b 
and standard deviation 1 Ib and 0.5 lb, respectively. 
What percent of filled boxes weighing between 104 Ib 
and 106 lb are to be expected? 


If the weight X of bags of cement is normally 
distributed with a mean of 40 kg and a standard 
deviation of 2 kg, how many bags can a delivery truck 
carry so that the probability of the total load exceeding 
2000 kg will be 5%? 


Decisions 


The ideas of confidence intervals and of tests” are the two most important ideas in modern 
statistics. In a statistical test we make inference from sample to population through testing a 
hypothesis, resulting from experience or observations, from a theory or a quality requirement, 
and so on. In many cases the result of a test is used as a basis for a decision, for instance, to 


Beginning around 1930, a systematic theory of tests was developed by NEYMAN (see Sec. 25.3) and EGON 
SHARPE PEARSON (1895-1980), English statistician, the son of Karl Pearson (see the footnote on p. 1086). 
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buy (or not to buy) a certain model of car, depending on a test of the fuel efficiency (miles/gal) 
(and other tests, of course), to apply some medication, depending on a test of its effect; to 
proceed with a marketing strategy, depending on a test of consumer reactions, etc. 

Let us explain such a test in terms of a typical example and introduce the corresponding 
standard notions of statistical testing. 


Test of a Hypothesis. Alternative. Significance Level a 


We want to buy 100 coils of a certain kind of wire, provided we can verify the manufacturer’s claim that the 
wire has a breaking limit 4p = (4p = 200 |b (or more). This is a test of the hypothesis (also called null hypothesis) 
/ = Mo = 200. We shall not buy the wire if the (statistical) test shows that actually uw = fy < po, the wire is 
weaker, the claim does not hold. 1, is called the alternative (or alternative hypothesis) of the test. We shall 
accept the hypothesis if the test suggests that it is true, except for a small error probability a, called the 
significance level of the test. Otherwise we reject the hypothesis. Hence a is the probability of rejecting a 
hypothesis although it is true. The choice of a is up to us. 5% and 1% are popular values. 

For the test we need a sample. We randomly select 25 coils of the wire, cut a piece from each coil, and 
determine the breaking limit experimentally. Suppose that this sample of n = 25 values of the breaking limit 
has the mean x = 197 Ib (somewhat less than the claim!) and the standard deviation s = 6 lb. 

At this point we could only speculate whether this difference 197 — 200 = —3 is due to randomness, is a 
chance effect, or whether it is significant, due to the actually inferior quality of the wire. To continue beyond 
speculation requires probability theory, as follows. 

We assume that the breaking limit is normally distributed. (This assumption could be tested by the method 
in Sec. 25.7. Or we could remember the central limit theorem (Sec. 25.3) and take a still larger sample.) Then 


_X> Ho 
S/Vn 


in (11), Sec. 25.3, with np = po has at-distribution with n — | degrees of freedom (n — 1 = 24 for our sample). 
Also x = 197 and s = 6 are observed values of X and S to be used later. We can now choose a significance 
level, say, a = 5%. From Table A9 in App. 5 or from a CAS we then obtain a critical value c such that 
PT Sc) =a =5%. For (TSC) = 1—-—a=95% the table gives ¢ = 1.71, so that c= —¢ = —-1L.71 
because of the symmetry of the distribution (Fig. 532). 

We now reason as follows—this is the crucial idea of the test. If the hypothesis is true, we have a chance 
of only a (= 5%) that we observe a value t of T (calculated from a sample) that will fall between —°% and 
—1.71. Hence, if we nevertheless do observe such a ¢, we assert that the hypothesis cannot be true and we reject 
it. Then we accept the alternative. If, however, t 2 c, we accept the hypothesis. 

A simple calculation finally gives t = (197 — 200)/(6/V/25) = —2.5 as an observed value of T. Since 
—2.5 < —1.71, we reject the hypothesis (the manufacturer’s claim) and accept the alternative w = uy < 200, 
the wire seems to be weaker than claimed. 


Reject hypothesis Do not reject hypothesis 


e=-1.71 0) t 
Fig. 532. t-distribution in Example 1 


This example illustrates the steps of a test: 


1. Formulate the hypothesis 6 = 6 to be tested. (899 = fo in the example.) 
2. Formulate an alternative 0 = 6. (6, = p41 in the example.) 
3. Choose a significance level a (5%, 1%, 0.1%). 


4. Use a random variable 6 = g(Xy,°-+,X,) whose distribution depends on the 
hypothesis and on the alternative, and this distribution is known in both cases. Determine 
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a critical value c from the distribution of 6, assuming the hypothesis to be true. (In the 
example, O = T, and c is, obtained from P(T S c) = a.) 

5. Use a sample x1,---,x,, to determine an observed value 6 = 8(X1,°°*, Xn) Of 6. 
(t in the example.) 


6. Accept or reject the hypothesis, depending on the size of 6 relative to c. (t< cin 
the example, rejection of the hypothesis.) 


Two important facts require further discussion and careful attention. The first is the 
choice of an alternative. In the example, 41 < mo, but other applications may require 
1 > Lo Or 1 # Mo. The second fact has to do with errors. We know that a (the 
significance level of the test) is the probability of rejecting a true hypothesis. And we 
shall discuss the probability B of accepting a false hypothesis. 


One-Sided and Two-Sided Alternatives (Fig. 533) 


Let 6 be an unknown parameter in a distribution, and suppose that we want to test the 
hypothesis 9 = 69. Then there are three main kinds of alternatives, namely, 


(1) 80> 00 
(2) 8 < 49 
(3) 0 # 8. 


(1) and (2) are one-sided alternatives, and (3) is a two-sided alternative. 

We call rejection region (or critical region) the region such that we reject the 
hypothesis if the observed value in the test falls in this region. In @ the critical c lies to 
the right of 69 because so does the alternative. Hence the rejection region extends to 
the right. This is called a right-sided test. In @) the critical c lies to the left of 69 (as 
in Example 1), the rejection region extends to the left, and we have a left-sided test 
(Fig. 533, middle part). These are one-sided tests. In @) we have two rejection regions. 
This is called a two-sided test (Fig. 533, lower part). 


Acceptance Region Rejection Region 
Do not reject hypothesis (Critical Region) 
(Accept hypothesis) Reject hypothesis 
@ , 
9% 
¢ 
Rejection Region Acceptance Region 
(Critical Region) Do not reject hypothesis 
Reject hypothesis (Accept hypothesis) 


@ 4 


0 
td 


Acceptance Region 


Rejection Region Do not reject Rejection Region 
(Critical Region) hypothesis (Critical Region) 
Reject hypothesis (Accept hypothesis) Reject hypothesis 
® , 
9 
cy cy 


Fig. 533. Test in the case of alternative (1) (upper part of the figure), alternative 
(2) (middle part), and alternative (3) 
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All three kinds of alternatives occur in practical problems. For example, (1) may arise 
if 69 is the maximum tolerable inaccuracy of a voltmeter or some other instrument. 
Alternative (2) may occur in testing strength of material, as in Example 1. Finally, 09 in 
(3) may be the diameter of axle-shafts, and shafts that are too thin or too thick are equally 
undesirable, so that we have to watch for deviations in both directions. 


Errors in Tests 
Tests always involve risks of making false decisions: 


(I) Rejecting a true hypothesis (Type I error). 
a = Probability of making a Type I error. 


(II) Accepting a false hypothesis (Type II error). 
B = Probability of making a Type II error. 


Clearly, we cannot avoid these errors because no absolutely certain conclusions about 
populations can be drawn from samples. But we show that there are ways and means of 
choosing suitable levels of risks, that is, of values @ and B. The choice of a depends on the 
nature of the problem (e.g., a small risk a = 1% is used if it is a matter of life or death). 

Let us discuss this systematically for a test of a hypothesis 6 = 09 against an alternative 
that is a single number 61, for simplicity. We let 0; > 69, so that we have a right-sided 
test. For a left-sided or a two-sided test the discussion is quite similar. 

We choose a critical c > 09 (as in the upper part of Fig. 533, by methods discussed 
below). From a given sample x4,---,x, we then compute a value 


6= 8(X1,°°* Xn) 
with a suitable g (whose choice will be a main point of our further discussion; for instance, 
take g = (vy + -:: + x,)/n in the case in which 6 is the mean). If 6 > c, we reject the 


hypothesis. If 6 <c, we accept it. Here, the value 6 can be regarded as an observed value 
of the random variable 


(4) 8 = g(%,°"*, Xn) 


because x; may be regarded as an observed value of Xj, 7 = 1,---,n. In this test there are 
two possibilities of making an error, as follows. 


Type I Error (see Table 25.4). The hypothesis is true but is rejected (hence the 


alternative is accepted) because O assumes a value 6>c. Obviously, the probability of 
making such an error equals 


(5) P(O = C)o=0 Oba 


a is called the significance level of the test, as mentioned before. 


Type II Error (see Table 25.4). The hypothesis is false but is accepted because ) 
assumes a value 0 = c. The probability of making such an error is denoted by B; thus 


(6) POS chon, = B. 
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EXAMPLE 2 


7 = 1 — B is called the power of the test. Obviously, the power 7 is the probability of 
avoiding a Type II error. 


Table 25.4 Type land Type II Errors in Testing a Hypothesis 
0 = 0, Against an Alternative 0 = 6, 


Unknown Truth 
0 = 0 0=60, 
a True decision Type II error 
o = = _ = 
5 0 = 00 P=1-a P=8B 
3 Type | error True decision 
0=6, P=a P=1-f8 


Formulas (5) and (6) show that both a and 6 depend on c, and we would like to choose 
c so that these probabilities of making errors are as small as possible. But the important 
Figure 534 shows that these are conflicting requirements because to let a decrease we must 
shift c to the right, but then 6 increases. In practice we first choose a (5%, sometimes 1%), 
then determine c, and finally compute f. If 6 is large so that the power n = 1 — B is small, 
we should repeat the test, choosing a larger sample, for reasons that will appear shortly. 


Density of fe) if 
the hypothesis 
is true 


Density of fe) if 
Lo the alternative 


=- | _ 
27" TSS. is true 
S 


Acceptance region —>~<— Rejection region (Critical region) 


Fig. 534. Illustration of Type | and II errors in testing a hypothesis 
0 = 6, against an alternative 6 = 6, (> Oo, right-sided test) 


If the alternative is not a single number but is of the form (1)—-(3), then B becomes a 
function of 6. This function B(@) is called the operating characteristic (OC) of the test 
and its curve the OC curve. Clearly, in this case 7 = | — B also depends on @. This 
function 7(@) is called the power function of the test. (Examples will follow.) 

Of course, from a test that leads to the acceptance of a certain hypothesis 69, it does 
not follow that this is the only possible hypothesis or the best possible hypothesis. Hence 
the terms “not reject” or “fail to reject” are perhaps better than the term “accept.” 


Test for jz of the Normal Distribution with Known o7 


The following example explains the three kinds of hypotheses. 


Test for the Mean of the Normal Distribution with Known Variance 


Let X be a normal random variable with variance 0? = 9. Using a sample of size n = 10 with mean X, test the 
hypothesis 4 = fo = 24 against the three kinds of alternatives, namely, 


(a) b> Ho (b) &@< po (c) @# Ho. 
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Solution. We choose the significance level a = 0.05. An estimate of the mean will be obtained from 
= 
X= Pact ee ae, em) Fe 

If the hypothesis is true, X is normal with mean pw = 24 and variance o?/ n = 0.9, see Theorem 1, Sec. 25.3. 


Hence we may obtain the critical value c from Table A8 in App. 5. 


Case (a). Right-Sided Test. We determine c from P(X > C)y=24 = a = 0.05, that is, 


= c— 24 
PX = c),=0a o( ) l-a=0.95. 
V0.9 


Table A8 in App. 5 gives (c — 24)/V0.9 = 1.645, and c = 25.56, which is greater than jo, as in the upper 
part of Fig. 533. If x S 25.56, the hypothesis is accepted. If x > 25.56, it is rejected. The power function of the 
test is (Fig. 535) 


ERR ieapaiaeel Pam te l ! 1 


1 
20 22 Ho 26 28 


Fig. 535. Power function (jw) in Example 2, case (a) (dashed) and case (c) 


ne) = P(X > 25.56), = 1 — P(X = 25.56), 


7) 25.56 — w 
1-©® 
( V0.9 


) 1 — 0(26.94 — 1.05) 


Case (b). Left-Sided Test. The critical value c is obtained from the equation 


c— 24 
V0.9 


P(X Sc), =24 o( ) a = 0.05. 


Table A8 in App. 5 yields c = 24 — 1.56 = 22.44. If x 2 22.44, we accept the hypothesis. If x < 22.44, we 
reject it. The power function of the test is 


22.44 — w 
V0.9 


(8) nw) = P(X = 22.44), = o( ) (23.65 — 1.05). 


Case (c). Two-Sided Test. Since the normal distribution is symmetric, we choose c, and cg equidistant from 
pe = 24, say, cy = 24 — k and cg = 24 + k, and determine k from 


_ k k 
P24 —-kSX 524+ by-04 o( ) o( ) 1—a=0.95. 
V0.9 V0.9 
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EXAMPLE 3 


Table A8 in App. 5 gives k/V/0.9 = 1.960, hence k = 1.86. This gives the values cy = 24 — 1.86 = 22.14 and 
co = 24 + 1.86 = 25.86. If x is not smaller than c, and not greater than cy, we accept the hypothesis. Otherwise 
we reject it. The power function of the test is (Fig. 535) 


n() = P(X < 22.14), + P(X > 25.86), = P(X < 22.14), + 1 — P(X S 25.86), 


0) , 0o( 24 = #) 0o( 26 = #) 
V0.9 V0.9 


= 1+ ©(23.34 — 1.05) — (27.26-1.05p). 


Consequently, the operating characteristic B(u) = 1 — n(n) (see before) is (Fig. 536) 
B(w) = (27.26 — 1.05) — (23.34 — 1.05). 


If we take a larger sample, say, of size n = 100 (instead of 10), then o7/n = 0.09 (instead of 0.9) and the 
critical values are cy = 23.41 and cg = 24.59, as can be readily verified. Then the operating characteristic of 
the test is 


24.59 — 23.41 — 
Blu) o( #) o( #) 
V0.09 V0.09 
= (81.97 — 3.334) — 0(78.03 — 3.33y). 


Figure 536 shows that the corresponding OC curve is steeper than that for n = 10. This means that the increase 
of n has led to an improvement of the test. In any practical case, n is chosen as small as possible but so 
large that the test brings out deviations between y and po that are of practical interest. For instance, if 
deviations of +2 units are of interest, we see from Fig. 536 that n = 10 is much too small because when 
p= 24 — 2 = 22 0rp = 24 + 2 = 26 B is almost 50%. On the other hand, we see that n = 100 is sufficient 
for that purpose. 3] 


20 22 Lo 26 28 


Fig. 536. Curves of the operating characteristic (OC curves) in 
Example 2, case (c), for two different sample sizes n 


Test for When a7 Is Unknown, and for o7 


Test for the Mean of the Normal Distribution with Unknown Variance 


The tensile strength of a sample of n = 16 manila ropes (diameter 3 in.) was measured. The sample mean was 
x = 4482 kg, and the sample standard deviation was s = 115 kg (N. C. Wiley, 41st Annual Meeting of the 
American Society for Testing Materials). Assuming that the tensile strength is a normal random variable, test 
the hypothesis 49 = 4500 kg against the alternative w,; = 4400 kg. Here uo may be a value given by the 
manufacturer, while 41 may result from previous experience. 
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Solution. We choose the significance level a = 5%. If the hypothesis is true, it follows from Theorem 2 
in Sec. 25.3, that the random variable 

X— po  X — 4500 

S/Vn 5/4 


has a t-distribution with n — 1 = 15 df. The test is left-sided. The critical value c is obtained from 
P(T <c),, = @ = 0.05. Table A9 in App. 5 gives c = —1.75. As an observed value of T we obtain from the 
sample t = (4482 — 4500)/(115/4) = —0.626. We see that ¢ > c and accept the hypothesis. For obtaining 
numeric values of the power of the test, we would need tables called noncentral Student t-tables; we shall not 
discuss this question here. @ 


Test for the Variance of the Normal Distribution 


Using a sample of size n = 15 and sample variance s” = 13 from a normal population, test the hypothesis 


= on = 10 against the alternative o= oF = 20. 


Solution. We choose the significance level a = 5%. If the hypothesis is true, then 


S2 is? 
Y= (@— 1) 14 1.48? 
of 10 


has a chi-square distribution with n — 1 = 14 df. by Theorem 3, Sec. 25.3. From 
P(Y > c) = a = 0.05, that is, P(Y Sc) = 0.95, 


and Table A10 in App. 5 with 14 degrees of freedom we obtain c = 23.68. This is the critical value of Y. Hence 
tos? = oeY/(n — 1) = 0.714Y there corresponds the critical value c* = 0.714 - 23.68 = 16.91. Since s*<c%, 
we accept the hypothesis. 

If the alternative is true, the random variable % = 1482/ of = 0.782 has a chi-square distribution with 14 
d.f. Hence our test has the power 


n = P(S? > c*),2~20 = PO > 0.7c*)2-20 = 1 — PO = 11.84),2-20. 


From a more extensive table of the chi-square distribution (e.g. in Ref. [G3] or [G8]) or from your CAS, you 
see that 1 ~ 62%. Hence the Type II risk is very large, namely, 38%. To make this risk smaller, we would 
have to increase the sample size. a] 


Comparison of Means and Variances 


Comparison of the Means of Two Normal Distributions 


Using a sample x1,-++, x», from a normal distribution with unknown mean jz and a sample yy,---, Yn, from 

another normal distribution with unknown mean y,, we want to test the hypothesis that the means are equal, 

My = My, against an alternative, say, {4 > fy. The variances need not be known but are assumed to be equal.? 
Two cases of comparing means are of practical importance: 


Case A. The samples have the same size. Furthermore, each value of the first sample corresponds to precisely 
one value of the other, because corresponding values result from the same person or thing (paired comparison) — 
for example, two measurements of the same thing by two different methods or two measurements from the two 
eyes of the same person. More generally, they may result from pairs of similar individuals or things, for example, 
identical twins, pairs of used front tires from the same car, etc. Then we should form the differences of 
corresponding values and test the hypothesis that the population corresponding to the differences has mean 0, 
using the method in Example 3. If we have a choice, this method is better than the following. 


3This assumption of equality of variances can be tested, as shown in the next example. If the test shows that 
they differ significantly, choose two samples of the same size ny = ng = n (not too small, > 30, say), use the 
test in Example 2 together with the fact that (12) is an observed value of an approximately standardized normal 
random variable. 
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Case B. The two samples are independent and not necessarily of the same size. Then we may proceed 
as follows. Suppose that the alternative is wz, > fy. We choose a significance level a. Then we compute the 
sample means x and y as well as (ny — 1)s2 and (ng — 1s? where s2 and sz are the sample variances. Using 
Table A9 in App. 5 with ny + ng — 2 degrees of freedom, we now determine c from 


(10) P(TSc)=1-a. 


We finally compute 


(1) to = _ [Matar + ng — 2) x—y : 
Ny + Ne Vin — Vs? + (ng - Ds? 


It can be shown that this is an observed value of a random variable that has a f-distribution with ny + ng — 2 
degrees of freedom, provided the hypothesis is true. If tg = c, the hypothesis is accepted. If to > c, it is rejected. 
If the alternative is fz, # My, then (10) must be replaced by 


(10*) P(T S cy) = 0.5a, P(T Sco) = 1 — 05a. 
Note that for samples of equal size ny = ng = n, formula (11) reduces to 


Va re 
(12) to = Vn ——. 
Vs2 +s? 


To illustrate the computations, let us consider the two samples (x1,-+-, X»,) and (y1,°+*, Yn,) given by 


105 108 86 103 103 107 124 105 
and 


89 92 84 97 103 107 111 97 


showing the relative output of tin plate workers under two different working conditions [J. J. B. Worth, Journal 
of Industrial Engineering 9, 249-253). Assuming that the corresponding populations are normal and have the 
same variance, let us test the hypothesis wz = jy against the alternative uw, # My. (Equality of variances will 
be tested in the next example.) 


Solution. We find 
x= 105.125, y=97.500, s2= 106.125.  s7 = 84.000. 


We choose the significance level a = 5%. From (10*) with 0.5a@ = 2.5%, 1 — 0.5a@ = 97.5% and Table A9 
in App. 5 with 14 degrees of freedom we obtain cy = —2.14 and cg = 2.14. Formula (12) with n = 8 gives the 
value 


to = V8 + 7.625/V/190.125 = 1.56. 


Since cy S to = cg, we accept the hypothesis jz = fy that under both conditions the mean output is the same. 
Case A applies to the example because the two first sample values correspond to a certain type of work, the 
next two were obtained in another kind of work, etc. So we may use the differences 


16 16 2 6 0 0 13 8 


of corresponding sample values and the method in Example 3 to test the hypothesis w = 0, where pw is the mean 
of the population corresponding to the differences. As a logical alternative we take 4 # 0. The sample mean is 
d = 7.625, and the sample variance is s? = 45.696. Hence 


t = V8 (7.625 — 0)/V45.696 = 3.19. 


From P(T S cy) = 2.5%, P(T S cg) = 97.5% and Table A9 in App. 5 with n — 1 = 7 degrees of freedom we 
obtain cy = —2.36, co = 2.36 and reject the hypothesis because t = 3.19 does not lie between c and cy. Hence 
our present test, in which we used more information (but the same samples), shows that the difference in output 
is significant. lei] 
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Comparison of the Variance of Two Normal Distributions 


Using the two samples in the last example, test the hypothesis 02 = on; assume that the corresponding 
populations are normal and the nature of the experiment suggests the alternative o> oy. 


Solution. We find s? = 106.125, 8 = 84.000. We choose the significance level a = 5%. Using 
P(V Sc) = 1-— a= 95% and Table All in App. 5, with (ny — 1, ng — 1) = (7,7) degrees of freedom, we 
determine c = 3.79. We finally compute v9 = s2/s2 = 1.26. Since vp S c, we accept the hypothesis. If vg < c, 
we would reject it. 

This test is justified by the fact that vg is an observed value of a random variable that has a so-called 
F-distribution with (n, — 1,2 — 1) degrees of freedom, provided the hypothesis is true. (Proof in Ref. [G3] 
listed in App. 1.) The F-distribution with (m, n) degrees of freedom was introduced by R. A. Fisher* and has 
the distribution function F(z) = 0 if z < 0 and 


2 
(13) F(z) = Kin | £6 — 2/2 mt + nyt? at (z = 0), 


0 


where Kin = m”?n" Thm + §n)/T4mlGn). (For I see App. A3.1.) ial 


This long section contained the basic ideas and concepts of testing, along with typical 
applications and you may perhaps want to review it quickly before going on, because the 
next sections concern an adaptation of these ideas to tasks of great practical importance 
and resulting tests in connection with quality control, acceptance (or rejection) of goods 


produced, and so on. 


PROBLEM SET 25-4 


1. 


2. 


test the hypothesis 4 = 60.0 against the alternative 
mb = 57.0 using a sample of size 20 with mean x = 58.50 
and choosing a = 5%. 

How does the result in Prob. 6 change if we use a small- 
er sample, say, of size 5, the other data (x = 58.05, 
a = 5%, etc.) remaining as before? 


From memory: Make a list of the three types of 8. Determine the power of the test in Prob. 6. 
alternatives, each with a typical example of your own. 9. What is the rejection region in Prob. 6 in the case of a 
Make a list of methods in this section, each with the two-sided test with a = 5%? 
distribution needed in testing. 10. CAS EXPERIMENT. Tests of Means and Variances. 
. Test w = Oagainst w > 0, assuming normality and (a) Obtain 100 samples of size 10 each from the normal 
using the sample 0, 1, —1, 3, —8, 6, 1 (deviations of the distribution with mean 100 and variance 25. For each 
azimuth [multiples of 0.01 radian] in some revolution sample, test the hypothesis go = 100 against the 
of a satellite). Choose a = 5%. alternative 41 > 100 at the level of a = 10%. Record 
In one of his classical experiments Buffon obtained 2048 the number of rejections of the hypothesis. Do the whole 
heads in tossing a coin 4040 times. Was the coin fair? experiment once more and compare. 
Do the same test as in Prob. 4, using a result by K. (b) Set up a similar experiment for the variance of a 
Pearson, who obtained 6019 heads in 12,000 trials. normal distribution and perform it 100 times. 
Assuming normality and known variance o” = 9, 11. A firm sells oil in cans containing 5000 g oil per can 


and is interested to know whether the mean weight 
differs significantly from 5000 g at the 5% level, in 
which case the filling machine has to be adjusted. Set 
up a hypothesis and an alternative and perform the test, 
assuming normality and using a sample of 50 fillings 
with mean 4990 g and standard deviation 20 g. 


* After the pioneering work of the English statistician and biologist, KARL PEARSON (1857-1936), the 


founder of the English school of statistics, and WILLIAM SEALY GOSSET (1876-1937), who discovered the 
t-distribution (and published under the name “Student”), the English statistician Sir RONALD AYLMER 
FISHER (1890-1962), professor of eugenics in London (1933-1943) and professor of genetics in Cambridge, 
England (1943-1957) and Adelaide, Australia (1957-1962), had great influence on the further development of 
modern statistics. 
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12. 


13. 


14. 


15. 


16. 


If a sample of 25 tires of a certain kind has a mean life 
of 37,000 miles and a standard deviation of 5000 miles, 
can the manufacturer claim that the true mean life of 
such tires is greater than 35,000 miles? Set up and test 
a corresponding hypothesis at the 5% level, assuming 
normality. 


If simultaneous measurements of electric voltage by 
two different types of voltmeter yield the differences 
(in volts) 0.4, —0.6, 0.2, 0.0, 1.0, 1.4, 0.4, 1.6, can we 
assert at the 5% level that there is no significant 
difference in the calibration of the two types of 
instruments? Assume normality. 


If a standard medication cures about 75% of patients 
with a certain disease and a new medication cured 310 
of the first 400 patients on whom it was tried, can we 
conclude that the new medication is better? Choose 
a = 5%. First guess. Then calculate. 


Suppose that in the past the standard deviation of 
weights of certain 100.0-oz packages filled by a 
machine was 0.8 oz. Test the hypothesis Hp: a = 0.8 
against the alternative H,:0 > 0.8 (an undesirable 
increase), using a sample of 20 packages with standard 
deviation 1.0 oz and assuming normality. Choose 
a=5%. 

Suppose that in operating battery-powered electrical 
equipment, it is less expensive to replace all batter- 
ies at fixed intervals than to replace each battery 
individually when it breaks down, provided the 
standard deviation of the lifetime is less than a certain 


17. 


18. 


19. 


20. 
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limit, say, less than 5 hours. Set up and apply a suitable 
test, using a sample of 28 values of lifetimes with 
standard deviation s = 3.5 hours and assuming 
normality: choose a = 5%. 


Brand A gasoline was used in 16 similar automobiles 
under identical conditions. The corresponding sample 
of 16 values (miles per gallon) had mean 19.6 and 
standard deviation 0.4. Under the same conditions, 
high-power brand B gasoline gave a sample of 16 
values with mean 20.2 and standard deviation 0.6. Is 
the mileage of B significantly better than that of A? 
Test at the 5% level; assume normality. First guess. 
Then calculate. 


The two samples 70, 80, 30, 70, 60, 80 and 140, 120, 
130, 120, 120, 130, 120 are values of the differences of 
temperatures (°C) of iron at two stages of casting, taken 
from two different crucibles. Is the variance of the first 
population larger than that of the second? Assume 
normality. Choose a = 5%. 

Show that for a normal distribution the two types of 
errors in a test of a hypothesis Hp: uw = po against an 
alternative Hy: ~ = m, can be made as small as one 
pleases (not zero!) by taking the sample sufficiently 
large. 


Test for equality of population means against the 
alternative that the means are different assuming 
normality, choosing a = 5% and using two samples of 
sizes 12 and 18, with mean 10 and 14, respectively, 
and equal standard deviation 3. 


25.5 Quality Control 


The ideas on testing can be adapted and extended in various ways to serve basic practical 
needs in engineering and other fields. We show this in the remaining sections for some 
of the most important tasks solvable by statistical methods. As a first such area of problems, 
we discuss industrial quality control, a highly successful method used in various industries. 
No production process is so perfect that all the products are completely alike. There is 
always a small variation that is caused by a great number of small, uncontrollable factors 
and must therefore be regarded as a chance variation. It is important to make sure that the 
products have required values (for example, length, strength, or whatever property may 
be essential in a particular case). For this purpose one makes a test of the hypothesis that 
the products have the required property, say, 4 = fo, where Mo is a required value. If 
this is done after an entire lot has been produced (for example, a lot of 100,000 screws), 
the test will tell us how good or how bad the products are, but it it obviously too late to 
alter undesirable results. It is much better to test during the production run. This is done 
at regular intervals of time (for example, every hour or half-hour) and is called quality 
control. Each time a sample of the same size is taken, in practice 3 to 10 times. If the 
hypothesis is rejected, we stop the production and look for the cause of the trouble. 
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If we stop the production process even though it is progressing properly, we make a 
Type I error. If we do not stop the process even though something is not in order, we 
make a Type II error (see Sec. 25.4). The result of each test is marked in graphical form 
on what is called a control chart. This was proposed by W. A. Shewhart in 1924 and 
makes quality control particularly effective. 


Control Chart for the Mean 


An illustration and example of a control chart is given in the upper part of Fig. 537. This 
control chart for the mean shows the lower control limit LCL, the center control line 
CL, and the upper control limit UCL. The two control limits correspond to the critical 
values c; and cg in case (c) of Example 2 in Sec. 25.4. As soon as a sample mean falls 
outside the range between the control limits, we reject the hypothesis and assert that the 


4.20 
0.5% 
4.15 —ucL—} 
Cc 
® 4.10 NWA CL 99% 
= 
4.05 LoL—f 
0.5% 
4.00 
Sample no. 5 10 
0.04 1% 
0.0365 —vuce_—} 
0.03 
5 
& 
a 
no) 
= 0.02 
7 99% 
™! 
Cc 
s 
n 
0.01 
(6) Y 
Sample no. 5 10 


Fig. 537. Control charts for the mean (upper part of figure) and 
the standard deviation in the case of the samples on p. 1089 
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production process is “out of control”; that is, we assert that there has been a shift in 
process level. Action is called for whenever a point exceeds the limits. 

If we choose control limits that are too loose, we shall not detect process shifts. On the 
other hand, if we choose control limits that are too tight, we shall be unable to run the 
process because of frequent searches for nonexistent trouble. The usual significance level 
is a = 1%. From Theorem | in Sec. 25.3 and Table A8 in App. 5 we see that in the case 
of the normal distribution the corresponding control limits for the mean are 


oO Oo 
(1) LCL = = 233 —=; UCL = oF aks == 
Ho Ne Ko Ve 


Here o is assumed to be known. If o is unknown, we may compute the standard deviations 
of the first 20 or 30 samples and take their arithmetic mean as an approximation of a. 
The broken line connecting the means in Fig. 537 is merely to display the results. 

Additional, more subtle controls are often used in industry. For instance, one observes 
the motions of the sample means above and below the centerline, which should happen 
frequently. Accordingly, long runs (conventionally of length 7 or more) of means all above 
(or all below) the centerline could indicate trouble. 


Table 25.5 Twelve Samples of Five Values Each 
(Diameter of Small Cylinders, Measured in Millimeters) 


Sample = 
Nesuber Sample Values x Si R 
1 4.06 4.08 4.08 4.08 4.10 4.080 0.014 0.04 
2 4.10 4.10 4.12 4.12 4.12 4.112 0.011 0.02 
3 4.06 4.06 4.08 4.10 4.12 4.084 0.026 0.06 
4 4.06 4.08 4.08 4.10 4.12 4.088 0.023 0.06 
> 4.08 4.10 4.12 4.12 4.12 4.108 0.018 0.04 
6 4.08 4.10 4.10 4.10 4.12 4.100 0.014 0.04 
7 4.06 4.08 4.08 4.10 4.12 4.088 0.023 0.06 
8 4.08 4.08 4.10 4.10 4.12 4.096 0.017 0.04 
9 4.06 4.08 4.10 4.12 4.14 4.100 0.032 0.08 
10 4.06 4.08 4.10 4.12 4.16 4.104 0.038 0.10 
11 4.12 4.14 4.14 4.14 4.16 4.140 0.014 0.04 
12 4.14 4.14 4.16 4.16 4.16 4.152 0.011 0.02 


Control Chart for the Variance 


In addition to the mean, one often controls the variance, the standard deviation, or the range. 
To set up a control chart for the variance in the case of a normal distribution, we may employ 
the method in Example 4 of Sec. 25.4 for determining control limits. It is customary to use only 
one control limit, namely, an upper control limit. Now from Example 4 of Sec. 25.4 we have 
ee ony, /(n — 1), where, because of our normality assumption, the random variable Y has a 
chi-square distribution with n — 1 degrees of freedom. Hence the desired control limit is 


2 
oc 
=] 


(2) UL, = 
n 
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where c is obtained from the equation 
P(Y>c)= a, that is, PYSc=1-a 


and the table of the chi-square distribution (Table A10 in App. 5) with n — 1 degrees of 
freedom (or from your CAS); here a (5% or 1%, say) is the probability that in a properly 
running process an observed value s of S? is greater than the upper control limit. 

If we wanted a control chart for the variance with both an upper control limit UCL and 
a lower control limit LCL, these limits would be 


Cc 
(3) LcL.=—_— and UCL = 


n-1 n-1? 


where c, and cg are obtained from Table A10 with n — 1 d.f. and the equations 


(4) P(Y Sc) = = and P(Y So) =1- a 


Control Chart for the Standard Deviation 


To set up a control chart for the standard deviation, we need an upper control limit 


aVc 


5 UCL = ——— 
(5) Ra 


obtained from (2). For example, in Table 25.5 we have n = 5. Assuming that the 
corresponding population is normal with standard deviation o = 0.02 and choosing 
a = 1%, we obtain from the equation 


PY Sc)=1-a=99% 


and Table A10 in App. 5 with 4 degrees of freedom the critical value c = 13.28 and from 
(5) the corresponding value 


0.02 V 13.28 


UCL = = 0.0365, 


which is shown in the lower part of Fig. 537. 
A control chart for the standard deviation with both an upper and a lower control limit 
is obtained from (3). 


Control Chart for the Range 


Instead of the variance or standard deviation, one often controls the range R (= largest 
sample value minus smallest sample value). It can be shown that in the case of the normal 
distribution, the standard deviation o is proportional to the expectation of the random 
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variable R* for which R is an observed value, say, o = A,,E(R*) where the factor of 
proportionality A,, depends on the sample size n and has the values 


n 2 3 4 5 6 7 8 9 10 
An = o/E(R*) 0.89 059 049 043 040 037 0.35 0.34 0.32 
n 1 14 16 18 20 30 40 50 
An = o/E(R*) 0.31 0.29 0.28 0.28 0.27 0.25 0.23 0.22 


Since R depends on two sample values only, it gives less information about a sample 
than s does. Clearly, the larger the sample size n is, the more information we lose in using 
R instead of s. A practical rule is to use s when n is larger than 10. 


PROBLEM SET 25-5 


1. Suppose a machine for filling cans with lubricating 7. Graph the ranges of the samples in Prob. 6 on a control 
oil is set so that it will generate fillings which form chart for ranges. 
a normal population with mean 1 gal and standard 8. Graph A, = o/E(R*) as a function of n. Why is Ap a 
deviation 0.02 gal. Set up a control chart of the monotone decreasing function of n? 
type shown in Fig. 537 for controlling the mean, that j 


is, find LCL and UCL, assuming that the sample size 9. Eight samples of size 2 were taken from a lot of screws. 
is 4. The values (length in inches) are 


2. Three-sigma control chart. Show that in Prob. 1, the Sample No. 1 2 3 4 5 6 7 8 
requirement of the significance level a = 0.3% leads 
toLCL = p — 30/Vnand UCL = p + 30/Va, and enn 3.50 3.51 3.49 3.52 3.53 3.49 3.48 3.52 
find the corresponding numeric values. 3.51 3.48 3.50 3.50 3.49 3.50 3.47 3.49 


3. What sample size should we choose in Prob. 1 if we Assuming that the population is normal with mean 
want LCL and UCL somewhat closer together, say, 3.500 and variance 0.0004 and using (1), set up a 
UCL — LCL = 0.02, without changing the signifi- control chart for the mean and graph the sample means 
cance level? on the chart. 


4. What effect on UCL — LCL does it have if we double —_—_ 10, Attribute control charts. Fifteen samples of size 100 
the sample size? If we switch from a= 1% to were taken from a production of containers. The 
a=5%? numbers of defectives (leaking containers) in those 


5. How should we change the sample size in controlling samples (in the order observed) were 


the mean of a normal population if we want 145 49 70 5 6 13 02 1 12 8 


UCL — LCL to decrease to half its original value? . ; : 
From previous experience it was known that the 


6. Graph the means of the following 10 samples average fraction defective is p = 4% provided that 
(thickness of gaskets, coded values) on a control chart the process of production is running properly. Using 
for means, assuming that the population is normal with the binomial distribution, set up a fraction defective 
mean 5 and standard deviation 1.16. chart (also called a p-chart), that is, choose the 


Time 10:00 11:00 12:00 13:00 14:00 15:00 16:00 17:00 18:00 19:00 


5 7 7 4 5 6 5 5 3 3 
Sample 2 5 3 4 6 4 5 2 4 6 
values 5 4 6 3 4 6 6 5 8 6 
6 4 5 6 6 4 4 3 4 8 
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LCL = 0 and determine the UCL for the fraction 
defective (in percent) by the use of 3-sigma limits, 
where o” is the variance of the random variable 

X = Fraction defective in a sample of size 100. 
Is the process under control? 
Number of defectives. Find formulas for the UCL, CL, 
and LCL (corresponding to 3a-limits) in the case of a 
control chart for the number of defectives, assuming 
that, in a state of statistical control, the fraction of 
defectives is p. 
CAS PROJECT. Control Charts. (a) Obtain 100 
samples of 4 values each from the normal distribution 
with mean 8.0 and variance 0.16 and their means, 
variances, and ranges. 
(b) Use these samples for making up a control chart 
for the mean. 
(c) Use them on a control chart for the standard 
deviation. 
(d) Make up a control chart for the range. 


(e) Describe quantitative properties of the samples 
that you can see from those charts (e.g., whether the 
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14. 


15. 


corresponding process is under control, whether the 
quantities observed vary randomly, etc.). 


Since the presence of a point outside control limits for 
the mean indicates trouble, how often would we be 
making the mistake of looking for nonexistent trouble 
if we used (a) l1-sigma limits, (b) 2-sigma limits? 
Assume normality. 

What LCL and UCL should we use instead of (1) if, 
instead of X, we use the sum x, + -::+ x, of the 
sample values? Determine these limits in the case of 
Fig. 537. 

Number of defects per unit. A so-called c-chart or 
defects-per-unit chart is used for the control of the 
number X of defects per unit (for instance, the number 
of defects per 100 meters of paper, the number of 
missing rivets in an airplane wing, etc.). (a) Set up 
formulas for CL and LCL, UCL corresponding to 
pw + 30, assuming that X has a Poisson distribution. 
(b) Compute CL, LCL, and UCL in a control process 
of the number of imperfections in sheet glass; assume 
that this number is 3.6 per sheet on the average when 
the process is in control. 


25.6 Acceptance Sampling 


Acceptance sampling is usually done when products leave the factory (or in some cases 
even within the factory). The standard situation in acceptance sampling is that a producer 
supplies to a consumer (a buyer or wholesaler) a lot of N items (a carton of screws, for 
instance). The decision to accept or reject the lot is made by determining the number x 
of defectives (= defective items) in a sample of size n from the lot. The lot is accepted 
if x =c, where c is called the acceptance number, giving the allowable number of 
defectives. If x > c, the consumer rejects the lot. Clearly, producer and consumer must 
agree on a certain sampling plan giving n and c. 

From the hypergeometric distribution we see that the event A: “Accept the lot” has 


probability (see Sec. 24.7) 


() 


rovemvao SNC) 


0) 


where M is the number of defectives in a lot of N items. In terms of the fraction defective 


0 = M/N we can write (1) as 


(2) 


(S 


PAO) = 


x=0 


eG) 


P(A; 6) can assume n + 1 values corresponding to 6 = 0, 1/N, 2/N,-:-, N/N; here, n and 
c are fixed. A monotone smooth curve through these points is called the operating 
characteristic curve (OC curve) of the sampling plan considered. 
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EXAMPLE 1 


EXAMPLE 2 


Sampling Plan 


Suppose that certain tool bits are packaged 20 to a box, and the following sampling plan is used. A sample of 
two tool bits is drawn, and the corresponding box is accepted if and only if both bits in the sample are good. 
In this case, N = 20,n = 2,c = 0, and (2) takes the form (a factor 2 drops out) 


P(A; 6) = © *) ‘7 7 20 (2) 


(20 — 20 4)(19 — 204) 
380 , 


The values of P(A, 6) for 6 = 0, 1/20, 2/20,---, 20/20 and the resulting OC curve are shown in Fig. 538. 
(Verify!) |_| 


P(A;@) 0.5 P(A; 6) 0.5 


0) 0.2 


(7) 


Fig. 538. OC curve of the sampling plan with n = 2 Fig. 539. OC curve in Example 2 
and c = 0 for lots of size N = 20 


In most practical cases 6 will be small (less than 10%). Then if we take small samples 
compared to N, we can approximate (2) by the Poisson distribution (Sec. 24.7); thus 


it 
ar (uw = n6). 
x! 


(3) PAO ~e"> 
x=0 


Sampling Plan. Poisson Distribution 


Suppose that for large lots the following sampling plan is used. A sample of size n = 20 is taken. If it contains 
not more than one defective, the lot is accepted. If the sample contains two or more defectives, the lot is rejected. 
In this plan, we obtain from (3) 


P(A; 0) ~ e 29° + 208), 


The corresponding OC curve is shown in Fig. 539. i) 


Errors in Acceptance Sampling 


We show how acceptance sampling fits into general test theory (Sec. 25.4) and what this 
means from a practical point of view. The producer wants the probability a of rejecting 
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P(A;@) _ 
95% 


Producer's risk 


50% + 


Consumer's risk 
p=15% 


15% ee ee 
l 1 1 
0 6 0, 
=1% = 5% 
Good ! Indifference ! Poor 
material ; zone | material 


Fig. 540. OC curve, producer’s and consumer’s risks 


an acceptable lot (a lot for which @ does not exceed a certain number 09 on which the 
two parties agree) to be small. 9 is called the acceptable quality level (AQL). Similarly, 
the consumer (the buyer) wants the probability 6 of accepting an unacceptable lot (a lot 
for which 6 is greater than or equal to some 6,) to be small. 0, is called the lot tolerance 
percent defective (LTPD) or the rejectable quality level (RQL). a is called producer’s 
risk. It corresponds to a Type I error in Sec. 25.4. B is called consumer’s risk and 
corresponds to a Type II error. Figure 540 shows an example. We see that the points 
(09, 1 — @) and (64, B) lie on the OC curve. It can be shown that for large lots we can 
choose 49, 61 (> 99), a, 8 and then determine n and c such that the OC curve runs very 
close to those prescribed points. Table 25.6 shows the analogy between acceptance 
sampling and hypothesis testing in Sec. 25.4. 


Table 25.6 Acceptance Sampling and Hypothesis Testing 


Acceptance Sampling Hypothesis Testing 
Acceptable quality level (AQL) 6 = 60 Hypothesis 0 = 09 
Lot tolerance percent defectives (LTPD) Nemaee =o, 
6= 04 
Allowable number of defectives c Critical value c 
Producer’s risk a of rejecting a lot Probability a of making a Type I error 
with 6 S 00 (significance level) 


Consumer’s risk 6 of accepting a lot 


with 6 = 0, Probability 8 of making a Type II error 


Rectification 


Rectification of a rejected lot means that the lot is inspected item by item and all defectives 
are removed and replaced by nondefective items. (This may be too expensive if the lot is 
cheap; in this case the lot may be sold at a cut-rate price or scrapped.) If a production 
turns out 1000% defectives, then in K lots of size N each, KN@ of the KN items are 
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defectives. Now KP(A; 0) of these lots are accepted. These contain KPN@ defectives, 
whereas the rejected and rectified lots contain no defectives, because of the rectification. 
Hence after the rectification the fraction defective in all K lots equals KPNO/KN. This is 
called the average outgoing quality (AOQ); thus 


(4) 


AOQ(@) = 6P(A; 6). 


Figure 541 shows an example. Since AOQ(0) = 0 and P(A; 1) = 0, the AOQ curve has 
a maximum at some 6 = 6*, giving the average outgoing quality limit (AOQL). This is 
the worst average quality that may be expected to be accepted under rectification. 


0.5 


Fig. 541. 


OC curve 


AQQ curve 


OC curve and AOQ curve for the sampling plan in Fig. 538 


PROBLEM SET 25-6 


1. 


Lots of kitchen knives are inspected by a sampling plan 
that uses a sample of size 20 and the acceptance number 
c = 1. What is the probability of accepting a lot with 
1%,2%,10% defectives (knives with dull blades)? 
Use Table A6 of the Poisson distribution in App. 5. 
Graph the OC curve. 


. What happens in Prob. 1 if the sample size is increased 


to 50? First guess. Then calculate. Graph the OC curve 
and compare. 


. How will the probabilities in Prob. 1 with n = 20 


change (up or down) if we decrease c to zero? First 
guess. 


. What are the producer’s and consumer’s risks in 


Prob. 1 if the AQL is 2% and the RQL is 15%? 


. Lots of copper pipes are inspected according to a 


sample plan that uses sample size 25 and acceptance 
number |. Graph the OC curve of the plan, using the 


10. 


Poisson approximation. Find the producer’s risk if the 
AQL is 1.5%. 


. Graph the AOQ curve in Prob. 5. Determine the AOQL, 


assuming that rectification is applied. 


. In Example 1 in the text, what are the producer’s and 


consumer’s risks if the AQL is 0.1 and the RQL is 0.6? 


. What happens in Example 1 in the text if we increase 


the sample size to n = 3, leaving the other data as 
before? Compute P(A; 0.1) and P(A; 0.2) and compare 
with Example 1. 


. Graph and compare sampling plans with c = 1 and 


increasing values of n, say, n = 2,3,4. (Use the 
binomial distribution.) 


Find the binomial approximation of the hypergeometric 
distribution in Example | in the text and compare the 
approximate and the accurate values. 
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11. Samples of 3 fuses are drawn from lots and a lot is 13. What is the consumer’s risk in Prob. 12 if we want the 
accepted if in the corresponding sample we find no RQL to be 12%? Use c =9 from the answer of 
more than | defective fuse. Criticize this sampling plan. Prob. 12. 


In particular, find the probability of accepting a lot 
that is 50% defective. (Use the binomial distribution 
(7), Sec. 24.7.) 

12. If in a sampling plan for large lots of spark plugs, the 
sample size is 100 and we want the AQL to be 5% and 15. Graph the OC curve and the AOQ curve for the single 
the producer’s risk 2%, what acceptance number c sampling plan for large lots with n = 5 and c = 0, and 
should we choose? (Use the normal approximation of find the AOQL. 
the binomial distribution in Sec. 24.8.) 


14. A lot of batteries for wrist watches is accepted if and 
only if a sample of 20 contains at most 1 defective. 
Graph the OC and AOQ curves. Find AOQL. [Use (3).] 


25.7 Goodness of Fit. 7 -Test 


To test for goodness of fit means that we wish to test that a certain function F(x) is the 
distribution function of a distribution from which we have a sample x1,---,x,. Then we 
test whether the sample distribution function F(x) defined by 


F(x) = Sum of the relative frequencies of all sample values x; not exceeding x 


fits F(x) “sufficiently well.” If this is so, we shall accept the hypothesis that F(x) is the 
distribution function of the population; if not, we shall reject the hypothesis. 

This test is of considerable practical importance, and it differs in character from the 
tests for parameters (2, o”, etc.) considered so far. 

To test in that fashion, we have to know how much F (x) can differ from F(x) if the 
hypothesis is true. Hence we must first introduce a quantity that measures the deviation 
of F(x) from F(x), and we must know the probability distribution of this quantity under 
the assumption that the hypothesis is true. Then we proceed as follows. We determine 
a number c such that, if the hypothesis is true, a deviation greater than c has a small 
preassigned probability. If, nevertheless, a deviation greater than c occurs, we have reason 
to doubt that the hypothesis is true and we reject it. On the other hand, if the deviation 
does not exceed c, so that F(x) approximates F(x) sufficiently well, we accept the 
hypothesis. Of course, if we accept the hypothesis, this means that we have insufficient 
evidence to reject it, and this does not exclude the possibility that there are other functions 
that would not be rejected in the test. In this respect the situation is quite similar to that 
in Sec. 25.4. 

Table 25.7 shows a test of that type, which was introduced by R. A. Fisher. This 
test is justified by the fact that if the hypothesis is true, then xb is an observed value 
of a random variable whose distribution function approaches that of the chi-square 
distribution with K — | degrees of freedom (or K — r — | degrees of freedom if r 
parameters are estimated) as n approaches infinity. The requirement that at least five 
sample values lie in each interval in Table 25.7 results from the fact that for finite 
n that random variable has only approximately a chi-square distribution. A proof can 
be found in Ref. [G3] listed in App. 1. If the sample is so small that the requirement 
cannot be satisfied, one may continue with the test, but then use the result with 
caution. 
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Table 25.7. Chi-square Test for the Hypothesis That F(x) is the Distribution Function 
of a Population from Which a Sample x,, - - - , x,, is Taken 


Step 1. 


Step 2. 


Step 3. 


() 


Step 4. 
Step 5. 


Subdivide the x-axis into K intervals J, /o,---, J such that each interval contains 
at least 5 values of the given sample x1, +++, x. Determine the number b; of sample 
values in the interval J;, where j = 1,---, K. If a sample value lies at a common 
boundary point of two intervals, add 0.5 to each of the two corresponding Jj. 


Using F(x), compute the probability p; that the random variable X under 
consideration assumes any value in the interval J;, where j = 1,---, K. Compute 
ej = npj. 


(This is the number of sample values theoretically expected in J; if the hypothesis 
is true.) 


Compute the deviation 
2 
g- >So? 
: i 
j=l 
Choose a significance level (5%, 1%, or the like). 


Determine the solution c of the equation 
P(X? Sc)=1-a 


from the table of the chi-sqare distribution with K — | degrees of freedom (Table 
A10 in App. 5). If r parameters of F(x) are unknown and their maximum likelihood 
estimates (Sec. 25.2) are used, then use K — r — 1 degrees of freedom (instead 
of K — 1). If x8 Sc, accept the hypothesis. If y% > c, reject the hypothesis. 


Table 25.8 Sample of 100 Values of the Splitting Tensile Strength (Ib/in.”) 
of Concrete Cylinders 


320 380 340 410 380 340 360 350 320 370 
350 340 350 360 370 350 380 370 300 420 
370 390 390 440 330 390 330 360 400 370 
320 350 360 340 340 350 350 390 380 340 
400 360 350 390 400 350 360 340 370 420 
420 400 350 370 330 320 390 380 400 370 
390 330 360 380 350 330 360 300 360 360 
360 390 350 370 370 350 390 370 370 340 
370 400 360 350 380 380 360 340 330 370 
340 360 390 400 370 410 360 400 340 360 


D. L. IVEY, Splitting tensile tests on structural lightweight aggregate concrete. Texas Transportation 
Institute, College Station, Texas. 


Test of Normality 


Test whether the population from which the sample in Table 25.8 was taken is normal. 


Solution. Table 25.8 shows the values (column by column) in the order obtained in the experiment. Table 
25.9 gives the frequency distribution and Fig. 542 the histogram. It is hard to guess the outcome of the test— 


does the histogram resemble a normal density curve sufficiently well or not? 
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The maximum likelihood estimates for ~ and o” are ju= X = 364.7 and &* = 712.9. The computation in 
Table 25.10 yields x3 = 2.688. It is very interesting that the interval 375 ---385 contributes over 50% of x@. 
From the histogram we see that the corresponding frequency looks much too small. The second largest 
contribution comes from 395 ---405, and the histogram shows that the frequency seems somewhat too large, 
which is perhaps not obvious from inspection. 


Table 25.9 Frequency Table of the Sample in Table 25.8 


1 2} 3 4 5 
Tensile Absolute Relative Cumulative Cumulative 
Strength Frequency Frequency Absolute Relative 
x Frequency Frequency 
[Ib/in.?] F(x) F(x) 
300 2 0.02 2 0.02 
310 0 0.00 2 0.02 
320 4 0.04 6 0.06 
330 6 0.06 12 0.12 
340 11 0.11 23 0.23 
350 14 0.14 37 0.37 
360 16 0.16 53 0.53 
370 15 0.15 68 0.68 
380 8 0.08 76 0.76 
390 10 0.10 86 0.86 
400 8 0.08 94 0.94 
410 2 0.02 96 0.96 
420 3 0.03 99 0.99 
430 0 0.00 99 0.99 
440 1 0.01 100 1.00 


We choose a = 5%. Since K = 10 and we estimated r = 2 parameters we have to use Table A10 in App. 5 
with K — r — | = 7 degrees of freedom. We find c = 14.07 as the solution of PQ? Sc) = 95%. Since xé S103 
we accept the hypothesis that the population is normal. 


0.20 


Fix) 


0.10 


0.05 


0 | | | 
250 300 350 400 450 


[lb./in.7] 


Fig. 542. Frequency histogram of the sample in Table 25.8 


1. Verify the calculations in Example 1 of the text. 


. If it is known that 25% of certain steel rods produced 
by a standard process will break when subjected to a 
load of 5000 Ib, can we claim that a new, less expensive 
process yields the same breakage rate if we find that in 
a sample of 80 rods produced by the new process, 27 
rods broke when subjected to that load? (Use a = 5%.) 
. If 100 flips of a coin result in 40 heads and 60 tails, 
can we assert on the 5% level that the coin is fair? 

. Ifin 10 flips of a coin we get the same ratio as in Prob. 3 
(4 heads and 6 tails), is the conclusion the same as in 
Prob. 3? First conjecture, then compute. 

. Can you claim, on a 5% level, that a die is fair if 60 
trials give 1,---, 6 with absolute frequencies 10, 13, 9, 
11, 9, 8? 

. Solve Prob. 5 if rolling a die 180 times gives 33, 27, 
29,35, 25; 31: 

. If a service station had served 60, 49, 56, 46, 68, 39 
cars from Monday through Friday between | PM. and 
2 P.M., can one claim on a 5% level that the differences 
are due to randomness? First guess. Then calculate. 

. A manufacturer claims that in a process of producing 
drill bits, only 2.5% of the bits are dull. Test the claim 
against the alternative that more than 2.5% of the bits 
are dull, using a sample of 400 bits containing 17 dull 
ones. Use a = 5%. 

. In a table of properly rounded function values, even 
and odd last decimals should appear about equally 
often. Test this for the 90 values of Jy(x) in Table Al 
in App. 5. 


11. 


12. 
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Table 25.10 Computations in Example 1 
; — 364.7 x; — 364.7 
Xj a o( j ) ¢ == 2} Term in (1) 
26.7 26.7 

—o +++ 325 —«% +++—149 0.0000 - - - 0.0681 6.81 6 0.096 

325 -- - 335 —149---—-1.11 0.0681 - - - 0.1335 6.54 6 0.045 

335+ ++ 345 —1.11 +--+ —0.74 0.1335 - - + 0.2296 9.61 11 0.201 

345 +--+ 355 —0.74 +--+ —-0.36 0.2296--- 0.3594 12.98 14 0.080 

355°** «365 —0.36 - 0.01 0.3594 --- 0.5040 14.46 16 0.164 

365 +++ 375 0.01 - 0.39 0.5040 --- 0.6517 = 14.77 15 0.0004 

375 +++ 385 0.39 - 0.76 0.6517 +--+ 0.7764 = 12.47 8 1.602 

385 - ++ 395 0.76 - 1.13 0.7764 - - - 0.8708 9.44 10 0.033 

395 --- 405 1.13:- 1.51 0.8708 - - - 0.9345 6.37 8 0.417 

405 +--+ 0% 151-++-+ © 0.9345 - - - 1.0000 6.55 0.046 
X5 = 2.688 
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TEAM PROJECT. Difficulty with Random 
Selection. 77 students were asked to choose 3 of the 
integers 11, 12, 13,---, 30 completely arbitrarily. The 
amazing result was as follows. 


Number 11 12 13 14 15 16 17 18 19 20 


Frequ. 11 10 20 8 13 9 21 9 16 8 


Number 21 22 23 24 25 26 27 28 29 30 


Frequ. 12 8 15 10 10 9 12 8 13 9 


If the selection were completely random, the following 
hypotheses should be true. 

(a) The 20 numbers are equally likely. 

(b) The 10 even numbers together are as likely as the 
10 odd numbers together. 

(c) The 6 prime numbers together have probability 0.3 
and the 14 other numbers together have probability 0.7. 
Test these hypotheses, using a = 5%. Design further 
experiments that illustrate the difficulties of random 
selection. 

CAS EXPERIMENT. Random Number Generator. 
Check your generator experimentally by imitating 
results of 7 trials of rolling a fair die, with a convenient 
n (e.g., 60 or 300 or the like). Do this many times and 
see whether you can notice any “nonrandomness” 
features, for example, too few Sixes, too many even 
numbers, etc., or whether your generator seems to work 
properly. Design and perform other kinds of checks. 
Test for normality at the 1% level using a sample of 
n = 79 (rounded) values x (tensile strength [kg/ mm?] 


1100 


13. 


of steel sheets of 0.3 mm thickness). a = a(x) = 
absolute frequency. (Take the first two values together, 
also the last three, to get K = 5.) 


x | 57 58 59 60 61 62 63 += 64 


a | 4 10 dy. 27 8 9 3 1 


Mendel’s pathbreaking experiments. In a famous 
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the weeks, and more than 2 accidents in no week? 
Choose a = 5%. 


Radioactivity. Rutherford-Geiger experiments. 
Using the given sample, test that the corresponding 
population has a Poisson distribution. x is the number 
of alpha particles per 7.5-s intervals observed by 
E. Rutherford and H. Geiger in one of their classical 
experiments in 1910, and a(x) is the absolute frequency 


plant-crossing experiment, the Austrian Augustinian 
father Gregor Mendel (1822-1884) obtained 355 
yellow and 123 green peas. Test whether this agrees 
with Mendel’s theory according to which the ratio 
should be 3:1. 


(= number of time periods during which exactly x 
particles were observed). Use a = 5%. 


14. Accidents in a foundry. Does the random variable a | 57 203, 383 525, 532, 408 273 
x = Number of accidents per week have a Poisson - 7 8 9 10 W 12 >13 
distribution if, within 50 weeks, 33 were accident-free, 

1 accident occurred in 11 of the 50 weeks, 2 in 6 of a 139 45 27 10 4 2 0 


25.8 Nonparametric Tests 


Nonparametric tests, also called distribution-free tests, are valid for any distribution. 
Hence they are used in cases when the kind of distribution is unknown, or is known but 
such that no tests specifically designed for it are available. In this section we shall explain 
the basic idea of these tests, which are based on “order statistics” and are rather simple. 
If there is a choice, then tests designed for a specific distribution generally give better 
results than do nonparametric tests. For instance, this applies to the tests in Sec. 25.4 for 
the normal distribution. 

We shall discuss two tests in terms of typical examples. In deriving the distributions 
used in the test, it is essential that the distributions, from which we sample, are continuous. 
(Nonparametric tests can also be derived for discrete distributions, but this is slightly more 
complicated.) 


EXAMPLE 1 _ Sign Test for the Median 


A median of the population is a solution x = @ of the equation F(x) = 0.5, where F is the distribution function 
of the population. 

Suppose that eight radio operators were tested, first in rooms without air-conditioning and then in air-conditioned 
rooms over the same period of time, and the difference of errors (unconditioned minus conditioned) were 


9 4 0 6 4 0 7 #11. 


Test the hypothesis 4 = 0 (that is, air-conditioning has no effect) against the alternative 1 > 0 (that is, inferior 
performance in unconditioned rooms). 


Solution. We choose the significance level a = 5%. If the hypothesis is true, the probability p of a positive 
difference is the same as that of a negative difference. Hence in this case, p = 0.5, and the random variable 


X = Number of positive values among n values 


has a binomial distribution with p = 0.5. Our sample has eight values. We omit the values 0, which do not 
contribute to the decision. Then six values are left, all of which are positive. Since 


P(X = 6) = @ (0.5)®(0.5)° 


= 0.0156 
= 1.56% 
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we have observed an event whose probability is very small if the hypothesis is true; in fact 1.56% <a = 5%. 
Hence we assert that the alternative > 0 is true. That is, the number of errors made in unconditioned rooms 
is significantly higher, so that installation of air conditioning should be considered. a3] 


Test for Arbitrary Trend 


A certain machine is used for cutting lengths of wire. Five successive pieces had the lengths 
29 31 28 30 = 32. 


Using this sample, test the hypothesis that there is no trend, that is, the machine does not have the tendency to 
produce longer and longer pieces or shorter and shorter pieces. Assume that the type of machine suggests the 
alternative that there is positive trend, that is, there is the tendency of successive pieces to get longer. 


Solution. We count the number of transpositions in the sample, that is, the number of times a larger value 
precedes a smaller value: 


29 precedes 28 (1 transposition), 


31 precedes 28 and 30 —_ (2 transpositions). 


The remaining three sample values follow in ascending order. Hence in the sample there are 1 + 2 = 3 
transpositions. We now consider the random variable 


T = Number of transpositions. 


If the hypothesis is true (no trend), then each of the 5! = 120 permutations of five elements 1 2 3 4 5 has the 
same probability (1/120). We arrange these permutations according to their number of transpositions: 


io co T=2 T=3 
2345 L 2 354 i 24-5 3 i246 43 
1243 5 1253 4 a 4 5D 

L a2 a4 1325 4 Los 3-4 

2 4 3.44 13 42 5 L ao a3 

I 4 a as i 4 32:5 

21 3 54 1523 4 

2,14 3 5 2 As a 

23.1 4 3 215 3 4 ete. 

a1 2 a& 5 23 14a 

2 34 1.5 

72.4 34 

8. i 25.4 

st a2 5 

$2 1 25 

4142435 


From this we obtain 


log 4 , 15 29 
PIT = 3) 120 ' 120 ' 120 ' 120 120 24%. 


We accept the hypothesis because we have observed an event that has a relatively large probability (certainly 
much more than 5%) if the hypothesis is true. 

Values of the distribution function of T in the case of no trend are shown in Table A12, App. 5. For instance, 
if n = 3, then F(O) = 0.167, F(1) = 0.500, F(2) = 1 — 0.167. If n = 4, then F(O) = 0.042, F(1) = 0.167, 
F(2) = 0.375, F(3) = 1 — 0.375, F(4) = 1 — 0.167, and so on. 


1102 


CHAP. 25 Mathematical Statistics 


Our method and those values refer to continuous distributions. Theoretically, we may then expect that all the 
values of a sample are different. Practically, some sample values may still be equal, because of rounding: If m 
values are equal, add m(m — 1)/4 (= mean value of the transpositions in the case of the permutations of m 
elements), that is, 3 for each pair of equal values, 3 for each triple, etc. ii] 


PROBLEEM—SET 25-8 


1. 


What would change in Example 1 had we observed 
only 5 positive values? Only 4? 


. Test & = 0 against px > 0, using 1, —1, 1, 3, —8,6,0 


(deviations of the azimuth [multiples of 0.01 radian] in 
some revolution of a satellite). 


. Are oil filters of type A better than type B filters if in 


11 trials, A gave cleaner oil than B in 7 cases, B gave 
cleaner oil than A in | case, whereas in 3 of the trials 
the results for A and B were practically the same? 


. Does a process of producing stainless steel pipes of 


length 20 ft for nuclear reactors need adjustment if, in a 
sample, 4 pipes have the exact length and 15 are shorter 
and 3 longer than 20 ft? Use the normal approximation 
of the binomial distribution. 


. Do the computations in Prob. 4 without the use of the 


DeMoivre—Laplace limit theorem in Sec. 24.8. 


. Thirty new employees were grouped into 15 pairs of 


similar intelligence and experience and were then 
instructed in data processing by an old method (A) 
applied to one (randomly selected) person of each pair, 
and by a new presumably better method (B) applied to 
the other person of each pair. Test for equality of 
methods against the alternative that (B) is better than 
(A), using the following scores obtained after the end 
of the training period. 


A) 60 70 80 85 75 40 70 45 95 80 90 60 80 75 65 


B65 85 85 80 95 65 100 60 90 85 100 75 90 60 80 


7. 


8. 


Assuming normality, solve Prob. 6 by a suitable test 
from Sec. 25.4. 


In a clinical experiment, each of 10 patients were given 
two different sedatives A and B. The following table 
shows the effect (increase of sleeping time, measured 
in hours). Using the sign test, find out whether the 
difference is significant. 


A 19 O08 Lt O11 -O.1 44 5.5 16 46 3.4 
B 0.7 -16 —0.2 -—1.2 —0.1 3.4 3.7 0.8 0.0 2.0 


Difference 1.2 2.4 1.3 13 


0.0 1.0 1.8 0.8 4.6 1.4 


9. 


10. 


11. 


12. 


13. 


14. 


15. 


Assuming that the populations corresponding to the 
samples in Prob. 8 are normal, apply a suitable test for 
the normal distribution. 


Test whether a thermostatic switch is properly set to 
50°C against the alternative that its setting is too low. 
Use a sample of 9 values, 8 of which are less than 50°C 
and | is greater. 


How would you proceed in the sign test if the 
hypothesis is &@# = {£9 (any number) instead of pp = 0? 
Test the hypothesis that, for a certain type of voltmeter, 
readings are independent of temperature T [°C] against 
the alternative that they tend to increase with T. Use 
a sample of values obtained by applying a constant 
voltage: 


Temperature T [°C] 10 20 30 40 £50 


Reading V [volts] 199.5 101.1 100.4 100.8 101.6 


Does the amount of fertilizer increase the yield of 
wheat X [kg/plot]? Use a sample of values ordered 
according to increasing amounts of fertilizer: 


33.4 35.3 31.6 35.0 36.1 37.6 36.5 38.7. 


Apply the test explained in Example 2 to the following 
data (x = diastolic blood pressure [mm Hg], y = 
weight of heart [in grams] of 10 patients who died of 
cerebral hemorrhage). 


x {121 120 95 123 140 112 92 100 102 91 


y [521 465 352 455 490 388 301 395 375 418 


Does an increase in temperature cause an increase of 
the yield of a chemical reaction from which the 
following sample was taken? 


10 20 
0.6 1.1 


Temperature [°C] 30 40 60 80 


Yield [kg/min] 0.9 16 1.2 2.0 
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25.9 Regression. Fitting Straight Lines. 
Correlation 


So far we were concerned with random experiments in which we observed a single quantity 
(random variable) and got samples whose values were single numbers. In this section we 
discuss experiments in which we observe or measure two quantities simultaneously, so 
that we get samples of pairs of values (x1, y1), (2, y2),°**, (Xn; Yn). Most applications 
involve one of two kinds of experiments, as follows. 


1. In regression analysis one of the two variables, call it x, can be regarded as an 
ordinary variable because we can measure it without substantial error or we can 
even give it values we want. x is called the independent variable, or sometimes 
the controlled variable because we can control it (set it at values we choose). The 
other variable, Y, is a random variable, and we are interested in the dependence of 
Y on x. Typical examples are the dependence of the blood pressure Y on the age x 
of a person or, as we shall now say, the regression of Y on x, the regression of the 
gain of weight Y of certain animals on the daily ration of food x, the regression of 
the heat conductivity Y of cork on the specific weight x of the cork, etc. 


2. In correlation analysis both quantities are random variables and we are interested 
in relations between them. Examples are the relation (one says “correlation’”) 
between wear X and wear Y of the front tires of cars, between grades X and Y of 
students in mathematics and in physics, respectively, between the hardness X of 
steel plates in the center and the hardness Y near the edges of the plates, etc. 


Regression Analysis 


In regression analysis the dependence of Y on x is a dependence of the mean yw of Y on 
x, so that w~ = p(x) is a function in the ordinary sense. The curve of p(x) is called the 
regression curve of Y on x. 

In this section we discuss the simplest case, namely, that of a straight regression line 


(1) M(x) = Ko + Ky. 


Then we may want to graph the sample values as n points in the xY-plane, fit a straight 
line through them, and use it for estimating p(x) at values of x that interest us, so that we 
know what values of Y we can expect for those x. Fitting that line by eye would not be 
good because it would be subjective; that is, different persons’ results would come out 
differently, particularly if the points are scattered. So we need a mathematical method that 
gives a unique result depending only on the n points. A widely used procedure is the method 
of least squares by Gauss and Legendre. For our task we may formulate it as follows. 


Least Squares Principle 


The straight line should be fitted through the given points so that the sum of the 
squares of the distances of those points from the straight line is minimum, where 
the distance is measured in the vertical direction (the y-direction). (Formulas below.) 
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To get uniqueness of the straight line, we need some extra condition. To see this, take 
the sample (0, 1), (0, —1). Then all the lines y = kyx with any ky satisfy the principle. 
(Can you see it?) The following assumption will imply uniqueness, as we shall find out. 


General Assumption (A1) 


The x-values X4,°++,Xy in our sample (x1, Y1),°**, Xn; Yn) are not all equal. 


From a given sample (4, y1),°**, (Xn; Yn) we shall now determine a straight line by 
least squares. We write the line as 


(2) Dim k 0 ap Uk 1x 
and call it the sample regression line because it will be the counterpart of the population 
regression line (1). 

Now a sample point (x;, yj) has the vertical distance (distance measured in the 


y-direction) from (2) given by 


ly — ko + kaxp)l (see Fig. 543). 


Fig. 543. Vertical distance of a point (x;, y) from a straight line y = kg + k,x 


Hence the sum of the squares of these distances is 


(3) 4 = >.05 — ko — kx)’. 
j=l 


In the method of least squares we now have to determine kg and k such that g is minimum. 
From calculus we know that a necessary condition for this is 


0 
hee 


0 
= 0 and = 
Oky 


(4) ike 


We shall see that from this condition we obtain for the sample regression line the formula 


(5) y—y=kix — x). 
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Here x and y are the means of the x- and the y-values in our sample, that is, 


(@) F= yt + xp) 
6) . 
) F=LOr t+ + yn. 


The slope k, in (5) is called the regression coefficient of the sample and is given by 


Say 


Sy 
Here the “sample covariance” s,,, is 


(8) Sry = —— i >D 63 — HO; - y) = ar y237= - (> «) (3) 
j=l i 


n 


and s2 is given by 


Lr 2 
(9a) 3 = > @j - x = 34-1(Es) |. 


n— 1 


From (5) we see that the sample regression line passes through the point (x, y), by which 
it is determined, together with the regression coefficient (7). We may call s% the variance 
of the x-values, but we should keep in mind that x is an ordinary variable, not a random 
variable. 

We shall soon also need 


2 
P i nuda ||) a 
(7b) Sy ~ n- | Per y= n—- 1 Di 1 Di , 
g=1 j=l J=1 


Derivation of (5) and (7). Differentiating (3) and using (4), we first obtain 


oq 

aE 2 — ko — kaxj) = 0, 
dk 

oq 

— = -25)x,(yj — ko — k1xj) = 0 
aky 


where we sum over j from | to n. We now divide by 2, write each of the two sums as 
three sums, and take the sums containing y; and x;y; over to the right. Then we get the 
“normal equations” 


kon ar ky Sx; = yy 
ko >) xj + ae = Se 


(10) 
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This is a linear system of two equations in the two unknowns kg and ky. Its coefficient 
determinant is [see (9)] 


nD; 

2 

Dy Da 

and is not zero because of Assumption (Al). Hence the system has a unique solution. 
Dividing the first equation of (10) by n and using (6), we get kg = y — kyx. Together 


with y = kg + kyx in (2) this gives (5). To get (7), we solve the system (10) by Cramer’s 
rule (Sec. 7.6) or elimination, finding 


nDXiVj ~ ae >> 


n(n — 1)s2 


2 
= n>\x7 _ (Ss) = n(n — 1s? = n> (x; _ x)" 


(1) ky = 


This gives (7)-(9) and completes the derivation. [The equality of the two expressions in 
(8) and in (9) may be shown by the student]. a 


Regression Line 


The decrease of volume y [%] of leather for certain fixed values of high pressure x [atmospheres] was measured. 
The results are shown in the first two columns of Table 25.11. Find the regression line of y on x. 


Solution. We see that n = 4 and obtain the values x = 28000/4 = 7000, y = 19.0/4 = 4.75, and from (9) 
and (8) 


Table 25.11 Regression of the Decrease of Volume y [%] 
of Leather on the Pressure x [Atmospheres] 


Given Values Auxiliary Values 
Xj V5 xj X5V5 
4000 23 16,000,000 9200 
6000 4.1 36,000,000 24,600 
8000 5.7 64,000,000 45,600 
10,000 6.9 100,000,000 69,000 
28,000 19.0 216,000,000 148,400 


28,0002, 20,000,000 
‘rae 
28,000-19\ 15,400 
4 ) 3 


5-1 
sz = 5 (216,000,000 


1 
Sy = 5 (148.400 


Hence ky = 15,400/20,000,000 = 0.00077 from (7), and the regression line is 
y — 4.75 = 0.00077(« — 7000) or y = 0.00077x — 0.64. 


Note that y(0) = —0.64, which is physically meaningless, but typically indicates that a linear relation is merely 
an approximation valid on some restricted interval. ia] 
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Confidence Intervals in Regression Analysis 


If we want to get confidence intervals, we have to make assumptions about the distribution 
of Y (which we have not made so far; least squares is a “geometric principle,” nowhere 
involving probabilities!). We assume normality and independence in sampling: 


Assumption (A2) 


For each fixed x the random variable Y is normal with mean (1), that is, 

(12) w(x) = ko + Kix 

and variance 0 independent of x. 

Assumption (A3) 

The n performances of the experiment by which we obtain a sample 
(x1,Y1), (2, ya), “1, (Xn, Yn) 


are independent. 


kK in (12) is called the regression coefficient of the population because it can be shown 
that, under Assumptions (A1)-(A3), the maximum likelihood estimate of k 1 is the sample 
regression coefficient ky given by (11). 

Under Assumptions (A1)-(A3), we may now obtain a confidence interval for k1, as 
shown in Table 25.12. 


Table 25.12 Determination of a Confidence Interval for i, in (1) under Assumptions (A1)—(A3) 


Step I. Choose a confidence level y(95%, 99%, or the like). 


Step 2. Determine the solution c of the equation 
(13) Fo) =30 + 9) 


from the table of the f-distribution with n — 2 degrees of freedom (Table A9 in 
App. 5; n = sample size). 


Step 3. Using a sample (x1, y1),°*-, (Wn, Yn), compute (n — 1)s2 from (9a), (n — 1)szy 
from (8), ky from (7), 
n I n 2 
(14) @= isp = > =; & ») 
j=l j=l 
[as in (9b)], and 


(15) Go = (n — 1)(s2 — kZs2). 


Step 4. Compute 
q 
K=c J : a4 
(n — 2)(n — I)sz 


The confidence interval is 
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Confidence Interval for the Regression Coefficient 
Using the sample in Table 25.11, determine a confidence interval for x; by the method in Table 25.12. 
Solution. Step 1. We choose y = 0.95. 


Step 2. Equation (13) takes the form F(c) = 0.975, and Table A9 in App. 5 with n — 2 = 2 degrees of freedom 
gives c = 4.30. 


Step 3. From Example 1 we have 3s2 = 20,000,000 and ky = 0.00077. From Table 25.11 we compute 


3 19? 
352 = 102.0 — — 
4 


11.95. 


go = 11.95 — 20,000,000 « 0.000777 
= 0.092. 
Step 4. We thus obtain 


K = 4.30V0.092/(2 - 20,000,000) 
= 0.000206 


and 


CONF 095 {0.00056 S Kk, = 0.00098}. B 


Correlation Analysis 


We shall now give an introduction to the basic facts in correlation analysis; for proofs see 
Ref. [G2] or [G8] in App. 1. 

Correlation analysis is concerned with the relation between X and Y in a two- 
dimensional random variable (X, Y) (Sec. 24.9). A sample consists of n ordered pairs of 
values (X1, y1),°**, &n, Yn), aS before. The interrelation between the x and y values in the 
sample is measured by the sample covariance s,, in (8) or by the sample correlation 
coefficient 


S. 
(17) r=— 


SpSy 


with s, and s, given in (9). Here r has the advantage that it does not change under a 
multiplication of the x and y values by a factor (in going from feet to inches, etc.). 


Sample Correlation Coefficient 


The sample correlation coefficient r satisfies —1 Sr 31. In particular, r= + 1 
if and only if the sample values lie on a straight line. (See Fig. 544.) 


The theoretical counterpart of r is the correlation coefficient p of X and Y, 


Oxy 
OxOy 


(18) p= 
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Fig. 544. Samples with various values of the correlation coefficient r 


where px = E(X), wy = E(Y), of = E((X — px]?), 0% = E(LY — wy]”) (the means 
and variances of the marginal distributions of X and Y; see Sec. 24.9), and oxy is the 
covariance of X and Y given by (see Sec. 24.9) 


(19) oxy = E([X — pxllY — pyl) = EY) — EX)EY). 


The analog of Theorem | is 


THEOREM 2 Correlation Coefficient 


The correlation coefficient p satisfies —1 = p = 1. In particular, p = +1 if and 
only if X and Y are linearly related, that is, Y = yX + 6, X = y*Y + 6*. 


X and Y are called uncorrelated if p = 0. 


THEOREM 3 Independence. Normal Distribution 


(a) Independent X and Y (see Sec. 24.9) are uncorrelated. 


(b) If (X, Y) is normal (see below), then uncorrelated X and Y are 
independent. 


Here the two-dimensional normal distribution can be introduced by taking two independent 
standardized normal random variables X*, Y*, whose joint distribution thus has the density 


1 =(ar*? 22 
20 #(x*, y#) = — e Oty 9/2 
(20) FPF y*) an ° 
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(representing a surface of revolution over the x*y*-plane with a bell-shaped curve as cross 
section) and setting 


Y = py + poyX* + V1I- proyY*. 


This gives the general two-dimensional normal distribution with the density 


1 = 
(21a) fay) = = i 
270x0yV1-—p 


where 


2 
1 Xx —~ PX X— Mx\/y — Py y~ By 
aby mes) = "a a ) 2»( ox )( oy )+( oy ) 


In Theorem 3(b), normality is important, as we can see from the following example. 


2 


Uncorrelated But Dependent Random Variables 


If X assumes —1, 0, 1 with probability 3 and Y= x. then E(X) = 0 and in (3) 


oxy = E(XY) = E(X®) = (-1)3- $+ 09-4 + 13-4 =0, 


so that p = 0 and X and Y are uncorrelated. But they are certainly not independent since they are even functionally 
related. a 


Test for the Correlation Coefficient p 


Table 25.13 shows a test for p in the case of the two-dimensional normal distribution. f is 
an observed value of a random variable that has a f-distribution with n — 2 degrees of 
freedom. This was shown by R. A. Fisher (Biometrika 10 (1915), 507-521). 


Table 25.13 Test of the Hypothesis p = O Against the Alternative p > 0 in the Case 
of the Two-Dimensional Normal Distribution 


Step 1. Choose a significance level a (5%, 1%, or the like). 


Step 2. Determine the solution c of the equation 
PTSc)=1-a 


from the f-distribution (Table A9 in App. 5) with n — 2 degrees of freedom. 
Step 3. Compute r from (17), using a sample (1, y1),°**, ns Yn)- 


( n>) 
t=r ‘ 
1-—r? 


If ¢ Sc, accept the hypothesis. If t > c, reject the hypothesis. 


Step 4. Compute 
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Test for the Correlation Coefficient p 


Test the hypothesis p = 0 (independence of X and Y, because of Theorem 3) against the alternative p > 0, using 
the data in the lower left corner of Fig. 544, where r = 0.6 (manual soldering errors on 10 two-sided circuit 
boards done by 10 workers; x = front, y = back of the boards). 


Solution. We choose a = 5%; thus | — a = 95%. Since n = 10,n — 2 = 8, the table gives c = 1.86. Also, 
t = 0.6V8/0.64 = 2.12 > c. We reject the hypothesis and assert that there is a positive correlation. A worker making 
few (many) errors on the front side also tends to make few (many) errors on the reverse side of the board. | 


1-10} SAMPLE REGRESSION LINE 


Find and graph the sample regression line of y on x and the 
given data as points on the same axes. Show the details of 
your work. 


1. (0, 1.0), (2, 2.1), (4, 2.9), (6, 3.6), (8, 5.2) 

2. (—2, 3.5), (1, 2.6), (3, 1.3), (5, 0.4) 

3. x = Revolutions per minute, y = Power of a Diesel 
engine [hp] 
x 400 500 600 700 750 


y 5800 14,200 = 18,800 


10,300 21,000 


4. x = Deformation of a certain steel [mm], y = Brinell 
hardness [kg/mm?] 


x 6 9 11 13 22 26 28 33 = 35 
y 68 67 65 53 44 40 37 34 32 


5. x = Brinell hardness, y = Tensile strength [in 1000 psi 
(pounds per square inch)] of steel with 0.45% C 
tempered for 1 hour 


x 200 300 400 500 


y 110 150 190 280 


6. Abrasion of quenched and tempered steel S620. 
x = Sliding distance [km], y = Wear volume [mm?} 


x 1.1 3.2 3.4 45 5.6 


y 40 65 120 150 —- 190 


7. Ohm’s law (Sec. 2.9). x = Voltage [V], y = Current 
[A]. Also find the resistance R [Q]. 


x 40 40 80 80 110 110 


y 5.1 4.8 0.0 10.3 13.0 12.7 


1. What is a sample? A population? Why do we sample 
in statistics? 

2. If we have several samples from the same population, 
do they have the same sample distribution function? 
The same mean and variance? 
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8. Hooke’s law (Sec. 2.4). x = Force [lb], y = Extension 
[in] of a spring. Also find the spring modulus. 


x 2 4 6 8 


y 4.1 7.8 12.3 15.8 


9. Thermal conductivity of water. x = Temperature 
[°F], y = Conductivity [Btu/(hr - ft - °F)]. Also find y 
at room temperature 66°F. 


x 32 50 100 150 212 


y 0.337 0.345 0.365 0.380 = 0.395 


10. Stopping distance of a car. x = Speed [mph]. y = 
Stopping distance [ft]. Also find y at 35 mph. 


x 30 40 50 60 


y 160 240 330 435 

11. CAS EXPERIMENT. Moving Data. Take a sample, 
for instance, that in Prob. 4, and investigate and graph 
the effect of changing y-values (a) for small x, (b) for 
large x, (c) in the middle of the sample. 


12-15} CONFIDENCE INTERVALS 


Find a 95% confidence interval for the regression 
coefficient k;, assuming (A2) and (A3) hold and using the 
sample. 


12. In Prob. 2 

13. In Prob. 3 

14. In Prob. 4 

15. x = Humidity of air [%], y = Expansion of gelatin [%], 
x 10 20 30 40 
y 0.8 1.6 2.3 2.8 


CHAPTER -25-REVIEW-QUESTIONS AND PROBLEMS 


3. Can we develop statistical methods without using 
probability theory? Apply the methods without using a 
sample? 

4. What is the idea of the maximum likelihood method? 
Why do we say “likelihood” rather than “probability”? 
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. Couldn’t we make the error of interval estimation zero 


simply by choosing the confidence level 1? 
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18. 


Assuming normality, find a 95% confidence interval for 
the variance from the sample 145.3, 145.1, 145.4, 146.2. 


6. What is testing? Why do we test? What are the errors 19. Using a sample of 10 values with mean 14.5 from 
involved? a normal population with variance 07 = 0.25, test 
7. When did we use the f-distribution? The F-distribution? the hypothesis fo = 15.0 against the alternative 


15. 


16. 


17. 


. What is the chi-square (x?) test? Give a sample 


example from memory. 


. What are one-sided and two-sided tests? Give typical 


examples. 


. How do we test in quality control? In acceptance 


sampling? 


. What is the power of a test? What could you perhaps 


do when it is low? 


. What is Gauss’s least squares principle (which he found 


at age 18)? 


. What is the difference between regression and 


correlation? 


. Find the mean, variance, and standard derivation of the 


sample 21.0 21.6 19.9 19.6 15.6 20.6 22.1 22.2. 
Assuming normality, find the maximum likelihood 
estimates of mean and variance from the sample in 
Prob. 14. 

Determine a 95% confidence interval for the mean pw 
of a normal population with variance o? = 25, using 
a sample of size 500 with mean 22. 

Determine a 99% confidence interval for the mean of 
a normal population, using the sample 32, 33, 32, 34, 
35, 29, 29, 27. 


SUMMARY—-OF CHAPTER 25 
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(1) 


20. 


21. 


22. 


23. 


24. 


25. 


We recall from Chap. 24 that, with an experiment in which we observe some quantity 
(number of defectives, height of persons, etc.), there is associated a random variable 
X whose probability distribution is given by a distribution function 


F(x) = PX Sx) 


which for each x gives the probability that X assumes any value not exceeding x. 
In statistics we take random samples x1,°--,x, of size n by performing that 


fy = 14.5 on the 5% level. Find the power. 

Three specimens of high-quality concrete had 
compressive strength 357, 359, 413 [kg/cm?], and for 
three specimens of ordinary concrete the values were 
346, 358, 302. Test for equality of the population means, 
1 = Pe, against the alternative wy > fy. Assume 
normality and equality of variance. Choose a = 5%. 
Assume the thickness X of washers to be normal with 
mean 2.75 mm and variance 0.00024 mm”. Set up 
a control chart for 2 and graph the means of the five 
samples (2.74, 2.76), (2.74, 2.74), (2.79, 2.81), (2.78, 
2.76), (2.71, 2.75) on the chart. 

The OC curve in acceptance sampling cannot have a 
strictly vertical portion. Why? 

Find the risks in the sampling plan with n = 6 and 
c = 0, assuming that the AQL is 69 = 1% and the 
RQL is 6; = 15%. How do the risks change if we 
increase n? 

Does a process of producing plastic rods of length 
j = 2 meters need adjustment if in a sample, 2 rods 
have the exact length and 15 are shorter and 3 longer 
than 2 meters? (Use the sign test.) 

Find the regression line of y on x for the data 
(x, y) = (0, 4), (2, 0), (4, —5), (6, —9), (8, —10). 


(Sec. 24.5) 


experiment n times (Sec. 25.1) and draw conclusions from properties of samples 
about properties of the distribution of the corresponding X. We do this by calculating 
point estimates or confidence intervals or by performing a test for parameters (u 
and go” in the normal distribution, p in the binomial distribution, etc.) or by a test 
for distribution functions. 
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A point estimate (Sec. 25.2) is an approximate value for a parameter in the 
distribution of X obtained from a sample. Notably, the sample mean (Sec. 25.1) 


ce 1 
(2) Ro a oi tt ee) 
j=l 


is an estimate of the mean yw of X, and the sample variance (Sec. 25.1) 


1 
3) x? = D (ej — BP = [ee — 2)? + + On — 3)" 


is an estimate of the variance o” of X. Point estimation can be done by the basic 
maximum likelihood method (Sec. 25.2). 

Confidence intervals (Sec. 25.3) are intervals 6; = 0 = 65 with endpoints 
calculated from a sample such that, with a high probability y, we obtain an interval 
that contains the unknown true value of the parameter 6 in the distribution of X. 
Here, y is chosen at the beginning, usually 95% or 99%. We denote such an interval 
by CONF, {61 = 0 & 09}. 


In a test for a parameter we test a hypothesis 0 = 09 against an alternative 0 = 04 
and then, on the basis of a sample, accept the hypothesis, or we reject it in favor of 
the alternative (Sec. 25.4). Like any conclusion about X from samples, this may 
involve errors leading to a false decision. There is a small probability a (which we 
can choose, 5% or 1%, for instance) that we reject a true hypothesis, and there is a 
probability 6 (which we can compute and decrease by taking larger samples) that 
we accept a false hypothesis. @ is called the significance level and 1 — 6 the power 
of the test. Among many other engineering applications, testing is used in quality 
control (Sec. 25.5) and acceptance sampling (Sec. 25.6). 

If not merely a parameter but the kind of distribution of X is unknown, we can 
use the chi-square test (Sec. 25.7) for testing the hypothesis that some function 
F(x) is the unknown distribution function of X. This is done by determining the 
discrepancy between F(x) and the distribution function F(x) of a given sample. 

“Distribution-free” or nonparametric tests are tests that apply to any distribution, 
since they are based on combinatorial ideas. These tests are usually very simple. 
Two of them are discussed in Sec. 25.8. 


The last section deals with samples of pairs of values, which arise in an experiment 
when we simultaneously observe two quantities. In regression analysis, one of the 
quantities, x, is an ordinary variable and the other, Y, is a random variable whose 
mean mw depends on x, say, w(x) = Kg + K,x. In correlation analysis the relation 
between X and Y in a two-dimensional random variable (X, Y) is investigated, 
notably in terms of the correlation coefficient p. 


———s 


References 


: NOS 


Software see at the beginning of Chaps. 19 
and 24. 


General References 

[GenRefl] Abramowitz, M. and I. A. Stegun (eds.), 
Handbook of Mathematical Functions. 10th printing, 
with corrections. Washington, DC: National Bureau 
of Standards. 1972 (also New York: Dover, 1965). See 
also [W1] 

[GenRef2] Cajori, F., History of Mathematics. 5th ed. 
Reprinted. Providence, RI: American Mathematical 
Society, 2002. 

[GenRef3] Courant, R. and D. Hilbert, Methods of 
Mathematical Physics. 2 vols. Hoboken, NJ: Wiley, 
1989. 

[GenRef4] Courant, R., Differential and _ Integral 
Calculus. 2 vols. Hoboken, NJ: Wiley, 1988. 

[GenRef5] Graham, R. L. et al., Concrete Mathematics. 
2nd ed. Reading, MA: Addison-Wesley, 1994. 

[GenRef6] Ito, K. (ed.), Encyclopedic Dictionary of 
Mathematics. 4 vols. 2nd ed. Cambridge, MA: MIT 
Press, 1993. 

[GenRef7] Kreyszig, E., Introductory Functional 
Analysis with Applications. New York: Wiley, 1989. 
[GenRef8] Kreyszig, E., Differential Geometry. Mineola, 

NY: Dover, 1991. 

[GenRef9] Kreyszig, E. Introduction to Differential 
Geometry and Riemannian Geometry. Toronto: 
University of Toronto Press, 1975. 

[GenRef10] Szegé, G., Orthogonal Polynomials. 4th ed. 
Reprinted. New York: American Mathematical Society, 
2003. 

[GenRefl1] Thomas, G. et al., Thomas’ Calculus, Early 
Transcendentals Update. 10th ed. Reading, MA: 
Addison-Wesley, 2003. 


Part A. Ordinary Differential Equations 
(ODEs) (Chaps. 1-6) 
See also Part E: Numeric Analysis 


[Al] Arnold, V. L, Ordinary Differential Equations. 3rd 
ed. New York: Springer, 2006. 

[A2] Bhatia, N. P. and G. P. Szego, Stability Theory of 
Dynamical Systems. New York: Springer, 2002. 

[A3] Birkhoff, G. and G.-C. Rota, Ordinary Differential 
Equations. 4th ed. New York: Wiley, 1989. 


APPENDIX 1 


[A4] Brauer, F. and J. A. Nohel, Qualitative Theory of 
Ordinary Differential Equations. Mineola, NY: Dover, 
1994, 

[A5] Churchill, R. V., Operational Mathematics. 3rd ed. 
New York: McGraw-Hill, 1972. 

[A6] Coddington, E. A. and R. Carlson, Linear Ordinary 
Differential Equations. Philadelphia: SIAM, 1997. 

[A7] Coddington, E. A. and N. Levinson, Theory of 
Ordinary Differential Equations. Malabar, FL: Krieger, 
1984. 

[A8] Dong, T.-R. et al., Qualitative Theory of Differential 
Equations. Providence, RI: American Mathematical 
Society, 1992. 

[A9] Erdélyi, A. et al., Tables of Integral Transforms. 
2 vols. New York: McGraw-Hill, 1954. 

[A10] Hartman, P., Ordinary Differential Equations. 2nd 
ed. Philadelphia: SIAM, 2002. 

[All] Ince, E. L., Ordinary Differential Equations. New 
York: Dover, 1956. 

[A12] Schiff, J. L., The Laplace Transform: Theory and 
Applications. New York: Springer, 1999. 

[A13] Watson, G. N., A Treatise on the Theory of Bessel 
Functions. 2nd ed. Reprinted. New York: Cambridge 
University Press, 1995. 

[A14] Widder, D. V., The Laplace Transform. Princeton, 
NJ: Princeton University Press, 1941. 

[A15] Zwillinger, D., Handbook of Differential Equations. 
3rd ed. New York: Academic Press, 1998. 


Part B. Linear Algebra, Vector Calculus 
(Chaps. 7-10) 

For books on numeric linear algebra, see also 
Part E: Numeric Analysis. 


[B1] Bellman, R., Introduction to Matrix Analysis. 2nd 
ed. Philadelphia: SIAM, 1997. 

[B2] Chatelin, F., Eigenvalues of Matrices. New York: 
Wiley-Interscience, 1993. 

[B3] Gantmacher, F. R., The Theory of Matrices. 2 vols. 
Providence, RI: American Mathematical Society, 2000. 

[B4] Gohberg, I. P. et al., Invariant Subspaces of Matrices 
with Applications. New York: Wiley, 2006. 

[B5] Greub, W. H., Linear Algebra. 4th ed. New York: 
Springer, 1975. 

[B6] Herstein, I. N., Abstract Algebra. 3rd ed. New York: 
Wiley, 1996. 


Al 


A2 APP. 1 References 


{[B7] Joshi, A. W., Matrices and Tensors in Physics. 3rd 
ed. New York: Wiley, 1995. 

[B8] Lang, S., Linear Algebra. 3rd ed. New York: 
Springer, 1996. 

[B9] Nef, W., Linear Algebra. 2nd ed. New York: Dover, 
1988. 

[B10] Parlett, B., The Symmetric Eigenvalue Problem. 

Philadelphia: SIAM, 1998. 


Part C. Fourier Analysis and PDEs 

(Chaps. 11-12) 

For books on numerics for PDEs see also Part 
E: Numeric Analysis. 


[C1] Antimirov, M. Ya., Applied Integral Transforms. 
Providence, RI: American Mathematical Society, 1993. 

[C2] Bracewell, R., The Fourier Transform and Its 
Applications. 3rd ed. New York: McGraw-Hill, 2000. 

[C3] Carslaw, H. S. and J. C. Jaeger, Conduction of Heat 
in Solids. 2nd ed. Reprinted. Oxford: Clarendon, 2000. 

[C4] Churchill, R. V. and J. W. Brown, Fourier Series 
and Boundary Value Problems. 6th ed. New York: 
McGraw-Hill, 2006. 

[C5] DuChateau, P. and D. Zachmann, Applied Partial 
Differential Equations. Mineola, NY: Dover, 2002. 

[C6] Hanna, J. R. and J. H. Rowland, Fourier Series, 
Transforms, and Boundary Value Problems. 2nd ed. 
New York: Wiley, 2008. 

[C7] Jerri, A. J., The Gibbs Phenomenon in Fourier 
Analysis, Splines, and Wavelet Approximations. Boston: 
Kluwer, 1998. 

[C8] John, F., Partial Differential Equations. 4th edition 
New York: Springer, 1982. 

[C9] Tolstov, G. P., Fourier Series. New York: Dover, 1976. 

[C10] Widder, D. V., The Heat Equation. New York: 
Academic Press, 1975. 

[C11] Zauderer, E., Partial Differential Equations of 
Applied Mathematics. 3rd ed. New York: Wiley, 2006. 

[C12] Zygmund, A. and R. Fefferman, Trigonometric Series. 
3rd ed. New York: Cambridge University Press, 2002. 


Part D. Complex Analysis (Chaps. 13-18) 

[D1] Ahlfors, L. V., Complex Analysis. 3rd ed. New 
York: McGraw-Hill, 1979. 

[D2] Bieberbach, L., Conformal Mapping. Providence, 
RI: American Mathematical Society, 2000. 

[D3] Henrici, P., Applied and Computational Complex 
Analysis. 3 vols. New York: Wiley, 1993. 

[D4] Hille, E., Analytic Function Theory. 2 vols. 2nd ed. 
Providence, RI: American Mathematical Society, 
Reprint V1 1983, V2 2005. 

[D5] Knopp, K., Elements of the Theory of Functions. 
New York: Dover, 1952. 


[D6] Knopp, K., Theory of Functions. 2 parts. New York: 
Dover, Reprinted 1996. 

[D7] Krantz, S. G., Complex Analysis: The Geometric 
Viewpoint. Washington, DC: The Mathematical 
Association of America, 1990. 

[D8] Lang, S., Complex Analysis. 4th ed. New York: 
Springer, 1999. 

[D9] Narasimhan, R., Compact Riemann Surfaces. New 
York: Springer, 1996. 

[D10] Nehari, Z., Conformal Mapping. Mineola, NY: 
Dover, 1975. 

{[D11] Springer, G., Introduction to Riemann Surfaces. 
Providence, RI: American Mathematical Society, 2001. 


Part E. Numeric Analysis (Chaps. 19-21) 

[E1] Ames, W. F., Numerical Methods for Partial 
Differential Equations. 3rd ed. New York: Academic 
Press, 1992. 

[E2] Anderson, E., et al., LAPACK User’s Guide. 3rd ed. 
Philadelphia: SIAM, 1999. 

[E3] Bank, R. E.. PLTMG. A Software Package for 
Solving Elliptic Partial Differential Equations: Users’ 
Guide 8.0. Philadelphia: SIAM, 1998. 

[E4] Constanda, C., Solution Techniques for Elementary 
Partial Differential Equations. Boca Raton, FL: CRC 
Press, 2002. 

[ES] Dahlquist, G. and A. Bjorck, Numerical Methods. 
Mineola, NY: Dover, 2003. 

[E6] DeBoor, C., A Practical Guide to Splines. Reprinted. 
New York: Springer, 2001. 

[E7] Dongarra, J. J. et al., LINPACK Users Guide. 
Philadelphia: SIAM, 1979. (See also at the beginning of 
Chap. 19.) 

[E8] Garbow, B. S. et al., Matrix Eigensystem Routines: 
EISPACK Guide Extension. Reprinted. New York: 
Springer, 1990. 

[E9] Golub, G. H. and C. F. Van Loan, Matrix 
Computations. 3rd ed. Baltimore, MD: Johns Hopkins 
University Press, 1996. 

[E10] Higham, N. J., Accuracy and Stability of Numerical 
Algorithms. 2nd ed. Philadelphia: SIAM, 2002. 

{E11] IMSL (International Mathematical and Statistical 
Libraries), FORTRAN Numerical Library. Houston, TX: 
Visual Numerics, 2002. (See also at the beginning of 
Chap. 19.) 

{[E12] IMSL, IMSL for Java. Houston, TX: Visual 
Numerics, 2002. 

[E13] IMSL, C Library. Houston, TX: Visual Numerics, 
2002. 

[E14] Kelley, C. T., Iterative Methods for Linear and 
Nonlinear Equations. Philadelphia: SIAM, 1995. 

{E15] Knabner, P. and L. Angerman, Numerical Methods for 
Partial Differential Equations. New Y ork: Springer, 2003. 


APP. 1 References 


[E16] Knuth, D. E., The Art of Computer Programming. 
3 vols. 3rd ed. Reading, MA: Addison-Wesley, 1997— 
2009. 

[E17] Kreyszig, E., Introductory Functional Analysis with 
Applications. New York: Wiley, 1989. 

[E18] Kreyszig, E., On methods of Fourier analysis in 
multigrid theory. Lecture Notes in Pure and Applied 
Mathematics 157. New York: Dekker, 1994, pp. 225-242. 

[E19] Kreyszig, E., Basic ideas in modern numerical 
analysis and their origins. Proceedings of the Annual 
Conference of the Canadian Society for the History and 
Philosophy of Mathematics. 1997, pp. 34-45. 

[E20] Kreyszig, E., and J. Todd, QR in two dimensions. 
Elemente der Mathematik 31 (1976), pp. 109-114. 
[E21] Mortensen, M. E., Geometric Modeling. 2nd ed. 

New York: Wiley, 1997. 

[E22] Morton, K. W., and D. F. Mayers, Numerical Solution 
of Partial Differential Equations: An Introduction. New 
York: Cambridge University Press, 1994. 

[E23] Ortega, J. M., Introduction to Parallel and Vector 
Solution of Linear Systems. New York: Plenum Press, 
1988. 

[E24] Overton, M. L., Numerical Computing with IEEE 
Floating Point Arithmetic. Philadelphia: SIAM, 2004. 

[E25] Press, W. H. et al., Numerical Recipes in C: The Art 
of Scientific Computing. 2nd ed. New York: Cambridge 
University Press, 1992. 

[E26] Shampine, L. F., Numerical Solutions of Ordinary 
Differential Equations. New York: Chapman and Hall, 
1994. 

[E27] Varga, R. S., Matrix Iterative Analysis. 2nd ed. New 
York: Springer, 2000. 

[E28] Varga, R. S., Gersgorin and His Circles. New York: 
Springer, 2004. 

[E29] Wilkinson, J. H., The Algebraic Eigenvalue 
Problem. Oxford: Oxford University Press, 1988. 


Part F. Optimization, Graphs (Chaps. 22-23) 
[Fl] Bondy, J. A. and U.S.R. Murty, Graph Theory with 
Applications. Hoboken, NJ: Wiley-Interscience, 1991. 
[F2] Cook, W. J. et al., Combinatorial Optimization. New 
York: Wiley, 1997. 

[F3] Diestel, R., Graph Theory. 4th ed. New York: 
Springer, 2006. 

[F4] Diwekar, U. M., Introduction to Applied Optimization. 
2nd ed. New York: Springer, 2008. 
[F5] Gass, S. L., Linear Programming. Method and 
Applications. 3rd ed. New York: McGraw-Hill, 1969. 
[F6] Gross, J. T. and J. Yellen (eds.), Handbook of Graph 
Theory and Applications. 2nd ed. Boca Raton, FL: CRC 
Press, 2006. 

[F7] Goodrich, M. T., and R. Tamassia, Algorithm 
Design: Foundations, Analysis, and Internet Examples. 
Hoboken, NJ: Wiley, 2002. 


A3 


[F8] Harary, F., Graph Theory. Reprinted. Reading, MA: 
Addison-Wesley, 2000. 
[F9] Merris, R., Graph Theory. Hoboken, NJ: Wiley- 
Interscience, 2000. 
[F10] Ralston, A., and P. Rabinowitz, A First Course in 
Numerical Analysis. 2nd ed. Mineola, NY: Dover, 2001. 
{Fl1] Thulasiraman, K., and M. N. S. Swamy, Graph 
Theory and Algorithms. New York: Wiley-Interscience, 
1992. 
[F12] Tucker, A., Applied Combinatorics. 
Hoboken, NJ: Wiley, 2007. 


5th ed. 


Part G. Probability and Statistics 
(Chaps. 24-25) 

[G1] American Society for Testing Materials, Manual on 
Presentation of Data and Control Chart Analysis. 7th 
ed. Philadelphia: ASTM, 2002. 

[G2] Anderson, T. W., An Introduction to Multivariate 
Statistical Analysis. 3rd ed. Hoboken, NJ: Wiley, 
2003. 

[G3] Cramér, H., Mathematical Methods of Statistics. 
Reprinted. Princeton, NJ: Princeton University Press, 
1999, 

[G4] Dodge, Y., The Oxford Dictionary of Statistical 
Terms. 6th ed. Oxford: Oxford University Press, 
2006. 

[G5] Gibbons, J. D. and S. Chakraborti, Nonparametric 
Statistical Inference. 4th ed. New York: Dekker, 2003. 

[G6] Grant, E. L. and R. S. Leavenworth, Statistical 
Quality Control. 7th ed. New York: McGraw-Hill, 
1996. 

[G7] IMSL, Fortran Numerical Library. Houston, TX: 
Visual Numerics, 2002. 

[G8] Kreyszig, E., Introductory Mathematical Statistics. 
Principles and Methods. New York: Wiley, 1970. 

[G9] O’Hagan, T. et al., Kendall's Advanced Theory of 
Statistics 3-Volume Set. Kent, U.K.: Hodder Arnold, 
2004. 

[G10] Rohatgi, V. K. and A. K. MD. E. Saleh, An 
Introduction to Probability and Statistics. 2nd ed. 
Hoboken, NJ: Wiley-Interscience, 2001. 


Web References 

[W1] upgraded version of [GenRefl] online at 
http://dlmf.nist.gov/. Hardcopy and CD-Rom: Oliver, 
W. J. et al. (eds.), NIST Handbook of Mathematical 
Functions. Cambridge; New York: Cambridge University 
Press, 2010. 

[W2] O’Connor, J. and E. Robertson, MacTutor History 
of Mathematics Archive. St. Andrews, Scotland: 
University of St. Andrews, School of Mathematics and 
Statistics. Online at http://www-history.mcs.st-andrews. 
ac.uk. (Biographies of mathematicians, etc.). 


Titit 
CHLLEHEE 


APPENDIX 2 


| 
) 


Answers to 
Odd-Numbered Problems 


it 


Problem Set 1.1, page 8 


1 
1. y = —cos 27x +c 3. y = ce” 
7 
1 
5. y = 2e “(sinx — cosx) +c y= ran sinh 5.13x + ¢ 
9. y = 1.65e-* + 0.35 11. y = (t+ de 
13. y = 1/0. + 3e-”) 15. y = Oand y = 1 because y’ = 0 for these y 


17. exp(—1.4- 107") = 5, ft = 107n 2)/1.4 [sec] 
19. Integrate y" = g twice, y'(t) = gt + U9, y'(0) = Uo = 0 (start from rest), then 
y(t) = xgt” + yo, where y(0) = yo = 0 


Problem Set 1.2, page 11 


11. Straight lines parallel to the x-axis 13. y = x 

15. mv' = mg — bu”, v' = 9.8 — v7, v(0) = 10, v’ = 0 gives the limit/9.8 = 3.1 
[meter/sec] 

17. Errors of steps 1, 5, 10: 0.0052, 0.0382, 0.1245, approximately 

19. x5 = 0.0286 (error 0.0093), x49 = 0.2196 (error 0.0189) 


Problem Set 1.3, page 18 


1. If you add a constant later, you may not get a solution. 
Example: y’ = y, Inly| =x +c, y =e"*° = Ge* but not e* + c(withc # 0) 


3. cos” y dy = dx, ay + 4 sin 2y +c=x 
5. y? + 36x? = ¢, ellipses 7. y = x arctan (x7 + ©) 
9 y = x/(c — x) 11. y = 24/x, hyperbola 
13. dy/sin? y = dx/cosh? x, —coty = tanhx +c, c=0, y = —arccot (tanh x) 
15. y? + 4x? = c = 25 17. y = x arctan (x? — 1) 
19. yoe"t = 2yo, e” = 2(1 week), ce?” = 222 weeks), e*” = 24 
21. 69.6% of yo 23. PV = c = const 


25. T = 22 — 17e~°? 3% = 21.9 [°C] when t = 9.68 min 
27.e = 5 k=, Ing, e = 0.01, ft = (in 100)/k = 66 [min] 
29. No. Use Newton’s law of cooling. 
31. y = ax, y' = g(y/x) = a = const, independent of the point (x, y) 
33. AS = 0.15SA¢, dS/db = 0.158, S = Spe®1°? = 10005, 
& = (1/0.15) In 1000 = 7.3 - 277. Eight times. 
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Problem Set 1.4, page 26 


1. 
5. 
9. 
11. 
13. 
15. 


Bxact, 2 2x, x*yp = 6.9 = c/x? 3. Exact, y = arccos (c/cos x) 

Not exact, y = Vx? + cx 7. F=e, e tan y =c 

Exact, u = e** cosy + K(y), uy = —e*siny +k’, k’ = 0. Ans. e* cosy = 1 
F = sinh x, sinh? x cos y=c 

u=e"+ky), uy=ki =-1l+e%, k=—-y+te¥.Ans.e*-—yt+e%=c 
b=k, ax? + 2kxy + ly? =c 


Problem Set 1.5, page 34 


37. 
39. 


ey =ce™— 52 5.y=(x+oe™ 

wy =x%(c + &) 9. y = (x — 2.5/e)e°S* 
-y=2+csinx 13. Separate. y — 2.5 =c cosh* 1.5x 
(V1 + ya)’ + pO + yo) = G1 + py + G2 + pye) =0+0=0 


- (v1 + yo)’ + pO1 + ya) = O1 + pyD + G2 + py) =r+0=r 
. Solution of cyy + pcyy = c(y, + py) = er 
-y = uy*, y’ + py = u' y* + uy*’ + puy* = u'y* + u(y*" + py*) = u'y* +u:0 


=rul =rfy*=relP® y= fel? dx dx + c. Thus, y = uyy, gives (4). We shall 
see that this method extends to higher-order ODEs (Secs. 2.10 and 3.3). 


Ly =1t Be” 

yole: wee? + 10/32 

. dx/dy = 6e% — 2x, x= ce 24 + 2eY 

.T = 240e** + 60, 7(10) = 200, k = —0.0539, t= 102 min 
.y =A-ky, y0)=0, y=AU —e*/k 

.y’ = 175(0.0001 — y/450), (0) = 450 + 0.0004 = 0.18, 


y = 0.135e7 388% + 0.045 = 0.18/2, 

e7 038891 = (9,09 — 0.045)/0.135 = 1/3, 

t = (In 3)/0.3889 = 2.82. Ans. About 3 years 

y =y-y?—-02y, y = 1/1125 — 0.75e~°**), limit 0.8, limit 1 

y’ = By” — Ay = By(y — A/B),A > 0,B > 0. Constant solutions y = 0, 

y =A/B, y' >Oify > A/B (unlimited growth), y' <0if0<y<A/B 
(extinction). y = A/(ce“* +B), y(0) > A/Bifc < 0, yO) < A/Bifc > 0. 


Problem Set 1.6, page 38 


1. 
5. 
7. 
11. 
13. 


15. 


x7/(c2 + 9) + y?/c? —1=0 3. y — cosh(x — c) —c = 0 

y/x =c,y'/x = y/x*, y' = y/x,y' = —x/y, y? + x? =, circles 

oy? — x2 =e 9. y’ = —2xy, y' = 1/(2xy), x = cel 
j= ex 

y’ = —4x/9y. Trajectories y’ = 9¥/4x, y = cx9/* E > 0). 

Sketch or graph these curves. 

U = C,U,dx + uydy=0, y'’ = —uy/uy. Trajectories y' = ug/u,z. Now 

U = C,v,dx + vy dy = 0, y= —v,/Vy. This agrees with the trajectory ODE 
in uw if uy, = vy (equal denominators) and uy = —v, (equal numerators). But these 


are just the Cauchy—Riemann equations. 
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Problem Set 1.7, page 42 


1. y’ = f(x, y) = rQ) 


p(x)y; hence df/dy = —p(x) is continuous and is thus 


bounded in the closed interval |x — xo| S a. 
3. In |x — xol < a; just take b in a = b/K large, namely, b = aK. 
5. R has sides 2a and 2b and center (1, 1) since y(1) = 1. In R, 


f=2y* S2b+ 1? 


K, a=b/K=b/(2(b + 1)?), da/db = 0 gives b = 1, 


and Qo, = b/K = g. Solution by dy/y” = 2 dx, etc., y = 1/3 — 2x). 
7.\1+y"| SK =1+57, a=b/K, da/db=0, b=1, a=35. 
9. No. At a common point (x4, y1) they would both satisfy the “initial condition” 
y(x4) = yi, violating uniqueness. 


Chapter 1 Review Questions and Problems, page 43 


-y=ce 


2a 


13. y = 1/(ce * + 4) 


-y =ce *~ + 0.01 cos 10x + 0.1 sin 10x 
.y =ce *>” + 0.640x — 0.256 
. Sy? — 4x2 = ¢ 

-y = sin(x + 47) 

we” = 1.25, (In 2)/In 1.25 = 3.1, (In 3)/In 1.25 = 4.9 [days] 
.e* = 0.9, 6.6 days. 43.7 days from e“ = 0.5, eX = 0.01 


Problem Set 2.1, page 53 


1. F(x, z,z') =0 


5. y = (cyx + Co) 


1/2 


21. F = x,x%e4 + x2y =c 
25. 3 sinx + 3 sin y =0 


3.y=cye” +c 


7. (dz/dy)z = —z3 sin y, —1/z = —dx/dy = cosy + C1,x = —siny + cyy + ce 
9. yo = x? In x 

13. y(t) = cue + kt + 

17. y = —0.75x9/? — 2.25%? 


Problem Set 2.2, page 59 


35% 
37. 


-y=cye 


-y = (cy + cox)e™™ 
—2.6x fs me 


-y =cye 


~y = (cy + coxye 


—2.5x ff eee 


a 


52/3 


.y” + 2V5y' + Sy =0 
-y = 4.6 cos 5x — 0.24 sin 5x 


-y=2e” 
1 

. a 

eS War 


0.278 cin (Wa x) 


11. y = cye** + ce 
15. y = 3 cos 2.5x — sin 2.5x 
19. y = 15e-* — sinx 


-2.8 3.2 
3.y = ce" + coe 


ly =e + ee°7" 

Ley = qe" tae" 

15. y = e 9?" (4 cos (Wax) + Bsin (V7 x) 
19, ay 4 sy = 0 

23. y = 6e2" + 4e7 3” 

27. y = (4.5 — xje™” 


31. Independent 


. yx? + cox” In x = 0 with x = | gives cy = 0; then co = 0 for x = 2, say. 
Hence independent 


ye, 


x 
> 


yo = 0.001le” + e~” 


Dependent since sin 2x = 2 sin x cos x 


App. 2. Answers to Odd-Numbered Problems A7 


Problem Set 2.3, page 61 


1. 4e?*, —e~* + 8c", —cosx — 2sinx 
3.0, 0, (D— 21)(—4e~7”) = 8e—™* + 8e—™* 
5.0,- Se", 0 
7,.2D—-1\(2D+D, y = ce? + coe? 
9. (D — 2.11D7, y = (c1 + coxje?” 
11. (D — 1.61)(D — 2.41), y = cye!®* + coe? 
15. Combine the two conditions to get L(cy + kw) = L(cy) + L(kw) = cLy + kLw. 
The converse is simple. 


Problem Set 2.4, page 69 


1. y’ = Yo COS Wot + (Vo/ Wo) sin wot. At integer ¢ (if @g = 77), because of periodicity. 
3. (i) Lower by a factor V2, (ii) higher by V2 

5. 0.3183, 0.4775, V(ky + ke)/m/(277) = 0.5738 

7 


. mL0" = —mg sin 0 ~ —mgé (tangential component of W = mg), 
6” + wo20 = 0, wo/(277) = Ve/L/(277) 
9. my” = —ayy, where m = 1kg, ay = 7 - 0.01? - 2y meter® is the volume of the 


water that causes the restoring force ayy with y = 9800 nt (= weight/ meter®). 

y” + wory = 0, woz = ay/m = ay = 0.000628y. Frequency wo/27r = 0.4 [sec™*]. 
13. y = [yo + (Wo + ayo) tle, y =[1 + Wo t+ Dele: 

Gi) v9 = —2, 3, -3, 3 —5 
15. w* = [wz — c?/(4m?)]/? = wo — c2/mk)]/? ~ wo — c?/8mk) = 2.9583 
17. The positive solutions of tan t = 1, that is, 7/4 (max), 57r/4 (min). etc 
19. 0.0231 = (In 2)/30 [kg/sec] from exp (—10 + 3c/2m) = 5. 


Problem Set 2.5, page 73 


3. y = (cy + cglnx)x 18 5. Vx (cy cos (In x) + cg sin (In x)) 
Tey = cyx7 + cox? 9. y = (cy + coln x)x°8 
11. y = x°(cy cos (V6 Inx) + cg sin (V6 In x) 

B.y=5 " 15. y = (3.6 + 4.0 Inx)/x 


17. y = cos (In x) + sin (In x) 19. y = —0.525x° + 0.625x 3 


Problem Set 2.6, page 79 


3. W = —2.2e~** 5.W = —x* 7,.W=a 

9. y" + 25y =0, W=5, y =3cos 5x — sin 5x 

ll. y" + 5y + 6.34=0, W=0.3e %, 3e >? cos 0.3x 

13. y" + 2y'=0, W=-2e72%, y=05(1 +e 7) 

15. y" — 3.24y =0, W=1.8, y = 14.2 cosh 1.8x + 9.1 sinh 1.8x 


Problem Set 2.7, page 84 


lL. y = cye* + coe ™ — 5c 3. y = cye 7 + coe * + 6x2 — 18x + 21 
5.y=(c, + cox)e + 5e *sinx Ty = pe 8 ee PS se" + 6x — 16 
9. y = cye*™ + coe ™ + 1.2x0e — 26% 

11. y = cos(V3x) + 6x2 — 4 


A8& 


App. 2. Answers to Odd-Numbered Problems 


13. y = gr =D ze” + e™ 15. y = Inx 
17. y = e~° (1.5 cos 0.5x — sin 0.5x) + 2e°™ 


Problem Set 2.8, page 91 


3. Yp = 1.0625 cos 2t + 3.1875 sin 2t 

5. Vp = —1.28 cos 4.52 + 0.36 sin 4.52 

7. Vp = 25 + cos 3¢ + sin 3¢ 

.y= e+°A cost + Bsint) + 0.8 cost + 0.4 sint 

11. y = Acos V2t + Bsin V2t + t(sin V2t — cos V21)/(2V2) 

13. y = Acost + Bsint — (cos wt)/ (wo — 1) 

15. y = e~7*(A cos 2t + B sin 2f) + 4 sin 2t 

17. y = 5 sint — ys sin 3t — 795 sin 5t 

19. y = e "(0.4 cost + 0.8 sin) + e /(—-0.4 cos st + 0.8 sin 31) 

25. CAS Experiment. The choice of w needs experimentation, inspection of the curves 
obtained, and then changes on a trail-and-error basis. It is interesting to see how in 
the case of beats the period gets increasingly longer and the maximum amplitude gets 
increasingly larger as w/(27r) approaches the resonance frequency. 


Problem Set 2.9, page 98 


Le eiic=0, Pacer"? 

3. LI! + RI=E, I= (E/R) + ce PU" = 48 + ce 

5. I = 2(cos t — cos 201)/399 

7. Ig is maximum when S = 0; thus, C = 1/(w?L). 

9.7=0 11. 7 = 5.5 cos 10f + 16.5 sin 10r A 
13. 1 = e~ (A cos 10t + B sin 101) — 400 cos 25¢ + 200 sin 25t A 
15. R > Reit = 2V L/C is Case I, ete. 
17. E(0) = 600, 1’(0) = 600, J = e~3*(—100 cos 4t + 75 sin 4t) + 100 cos t 
19.R=20, L=1H, C=%F, E=44sin10rV 


Problem Set 2.10, page 102 


1. y = Acos 3x + Bsin 3x + 3 (cos 3x) In |cos 3x| + ax sin 3x 


3.y =cyx t+ Cox — xsinx 5.y =Acosx + Bsinx + 3x(cos x + sin x) 
T.y = (cy + Cox) er + x —2e%% 9. y = (cy + cox)e” + Ax 7/2 @ 
ll.y = Cx? + cox? + 1/(2x*) 13. y = cx? + cox? + 3x° 


Chapter 2 Review Questions and Problems, page 102 


Lyeaer” +c" 9. y = e 3"(A cos 5x + B sin 5x) 
11. y = (cy + cox)e?®* 13. y = cyx* + cox? 
15. y = cye™™ + coe 7? — 3x + x® AT y = (cy + conde + 0.25271 
19. y = 5cos 4x — 3sin 4x + e* 21. y = —4x + 2x3 + 1/x 


23. 1 = —0.01093 cos 415t¢ + 0.05273 sin 415t A 


App. 2. Answers to Odd-Numbered Problems A9 


25. 1 = 73 (50 sin 4t — 110 cos 4) A 
27. RLC-circuit with R = 200,L =4H,C = 0.1 F, E = —25 cos 4tV 
29. w = 3.1 is close to wy = Vk/m = 3, y = 25(cos 3t — cos 3.1f). 


Problem Set 3.1, page 111 


9. Linearly independent 11. Linearly independent 
13. Linearly independent 15. Linearly dependent 


Problem Set 3.2, page 116 


1. y = cy + co cos 5x + cg sin 5x 3. y = cy + cox + cgcos 2x + cq sin 2x 
5. y = Ayjcosx + By, sinx + Agcos 3x + Bo sin 3x 
7. y = 2.398 + e~ 1®*(1.002 cos 1.5x — 1.998 sin 1.5x) 
9. y = 4e~* + 5e~*/? cos 3x 11. y = cosh 5x — cos 4x 
13. y = e° 2 + 4.3¢-°" + 12.1 cos 0.1x — 0.6 sin 0.1x 


Problem Set 3.3, page 122 


1. y = (cy + cox + cgx%e7* + fe™ — x +2 
3. y = cy, cos x + cosinx + cgcos 3x + cq sin 3x + 0.1 sinh 2x 


5.y= cyx? + cox + cax 4 _ ix? 
7. y = (cy + cox + c9x%e?” — F(cos 3x — sin 3x) 


9. y = cosx + 3 sin 4x 11. y = e 3*(—1.4 cos x — sin x) 
13. y = 2 — 2sinx + cosx 


Chapter 3 Review Questions and Problems, page 122 


T.y = cy + e (A cos 3x + B sin 3x) 

9. y = c, cosh 2x + cg sinh 2x + cg. cos 2x + cq sin 2x + cosh x 
11. y = (cy + cox + exe Pe 13. y = (cy + cox + c3x2e~ 7 + x7 — 3x + 3 
15. y = cyx + cox? + c5x3/? — 40 17. y = 2e7?* cos 4x + 0.05 x — 0.06 
19. y = 4e7* + 5e7* 


Problem Set 4.1, page 136 


1. Yes 
5. yy = 0.02(—y1 + ya), yo = 0.02(y1 — 2y2 + ys), y3 = 0.02(y2 — y3) 
7.cy = 1, cg = —5 9. cy = 10, Cg =5 


Iyi=ye, ya=yit +Pye, yeall 4)"ef + etl —a]'e 4 
13. yi) = yo, yo = 24y, — 2ya, yi = cye" + ee =y, yo=y’ 
15. (a) For example, C = 1000 gives —2.39993, —0.000167. (b) —2.4, 0. 

(d) dong = —4 + 2V6.4 = 1.05964 gives the critical case. C about 0.18506. 


Al0 App. 2. Answers to Odd-Numbered Problems 


Problem Set 4.3, page 147 


ly, = ge + ee", yo = —3c,e"7# + coe! 
3. y= 2cye7" + 29, yo = ae” — C2 
5. y= 5c + Jee 

14.5t 


yg = —2c, + S5cee 
7.1 = —C2cos V2t + cg sin V2t + cy 
yg = coV2 sin V2t + cgV2 cos V2t 
yg = C2COS V2t — cgsin V2t + cy 
9. yy = gcye 18 + 2cge** — cge18" 
yo = cye 18" + eye + cge1%* 
yg = ce 18t — Ioe% — 1 oe l8t 
11. y; = —20e" + 8e7*/? 
yo = 4e" — Ag t? 
13. y, = 2sinht, yo = 2cosht 


15. y, = ze! 
as Lt 
V2.2 26 


17.y2=yi ty, yo=y¥1 tyr = yi — ye = —y1- Git yD, 
yt + 2y, + 2y, = 0, yy =e “(Acost+ Bsind), 
yo=yty= e (Bcost — Asin ?). Note that r? = yi + ye = e742 + B), 


19. 1, = cye* + 3c9e7%", In = —3ce7* — coe ** 


Problem Set 4.4, page 151 


1. Unstable improper node, yy = cye’, ye = coe 
3. Center, always stable, yj = A cos 3t + Bsin3t, yo = 3B cos 3t — 3A sin 3t 
5. Stable spiral, yy = e~ "(A cos 2t + Bsin 2A), yo = e 7"(B cos 2t — A sin 21) 
7. Saddle point, always unstable, yy = ce’ + coe, yo = —cye' + coe 
9. Unstable node, yy = cye™ + coe”, yg = 2ce _ 2c9e7" 

11. y =e‘ (Acost + B sin). Stable and attractive spirals 

15. p = 0.2 # 0 (was 0), A<O, spiral point, unstable. 

17. For instance, (a) —2, (b) —1, (ce) = 3, (d) =1, (e) 4. 


Problem Set 4.5, page 159 


5. Center at (0, 0). At (2, 0) set y) = 2 + Jy. Then ¥4 = Jz. Saddle point at (2, 0). 
7. (0,0), yi = —y1 + ya, yo = —y1 — ye, stable and attractive spiral point; (—2, 2), 
y= —-2+51, yo=24+ Fe, J = —¥1 — 392, Fo = —V1 — Jo, saddle point 
9. (0, 0) saddle point, (—3, 0) and (3, 0) centers 
11. (67 + 2n7r, 0) saddle points; (-aa + 2n7r, 0) centers. 
Use —cos (+37 + 91) = sin (+91) ~ +Yy. 
13. (+2nz7r, 0) centers; yy = (2n + 1) + 34, (@ + 2n7T, 0) saddle points 
15. By multiplication, yeys = (4y, — y?)y}. By integration, 
ya = Ay? - aye + ck = 3(C +4 yti(c 4+ yi), where c* = im — 8. 


Problem Set 4.6, page 163 


3. y, =cye’ + cge", yo = —cye! + coet — e! 


5. y1 = cye™’ + coe** — 0.43t — 0.24, yo = cye™* — 2cge™* + 1.12¢ + 0.53 


App. 2. Answers to Odd-Numbered Problems All 


7. y1 = cye’ + Acge** — 3t — 4 — 2e~*, yo = —cye’ — Scoe! + St +75 +e* 

9. The formula for v shows that these various choices differ by multiples of the eigen- 
vector for A = —2, which can be absorbed into, or taken out of, cy in the general 
solution ae 

11. yy = -8 cosh t — 3 sinh t + He , y= -8 sinh t — 3 cosh t + sez 
13. y; = cos 2t + sin2t+ 4 cost, yo = 2cos 2t — 2sin2t+ sint 
15. y, = 4e7* — 4e* + &* yo = —4e°° +1 
17. I, = 2cye*® + 2cge*2* + 100, 
In = (1.1 + VO4l)ce** + (1.1 — VO41) coe”, 
Ay = —0.9+ V041, Az = —0.9 — V0.41 
19. cy = 17.948, co = —67.948 


2t 


Chapter 4 Review Questions and Problems, page 164 


11. yy = cye*® + coe, yo = 2cye** — 2coe~*". Saddle point 


13. y) =e *(Acost + Bsinft), yo = ze “[(B — 2A) cost — (A + 2B) sin f]; 
asymptotically stable spiral point 

15. y, = ce’ + coe, yg = cye >” — coe". Stable node 

17. yy = e ‘(A cos 2t + B sin 2A), yg = e ‘(Bos 2t — A sin 2r). Stable and attractive 
spiral point 

19. Unstable spiral point 

21. yy = cye 8 + coe — 1 -— 817, yo = —cye* + coe** — 4t 

23. yy = 2cye~* + 2cge** + cost — sint, yo = —cye! + coe* 

25. 11 + 2.5, — Ip) = 169 sint, 2.5, — 11) + 25ln = 0, 
I, = (19 + 32.5\e~>* — 19 cost + 62.5 sin t, 
In = (—6 — 32.51)e~** + 6 cost + 2.5 sint 

27. (0, 0) saddle point; (—1, 0), (1, 0) centers 

29. (n7r, 0) center when n is even and saddle point when n is odd 


am 


Problem Set 5.1, page 174 


3. Vik| 
5. V3/2 


aye 
7. y = ag(1 — x? + x4/2! — x8/3! + — ---) = age 
9. y = dy + ayx — $agx* — §ayx? + “++ = agcos x + ay sinx 
1.4 1.5 12,13,1 4 1.5 
11. ag(1 — jax" — ox ree) Faye + gx" + gx" + gax” — aqx ney 


Bal = 50" ae Pek FO Ae Ee Hae Peet 
EO Dg SOD a 
won @tDe+1 ~~ om — 3)! 


= 2_ 5,38, 2,4411,5 1) _ 923 
Ws=1l+x-x ex + 3x° t+ agx°, s(s) = 768 


19.5 =4 — x? ax? ax, s(2) = —8. but x = 2 is too large to give good 
values. Exact: y = (x — 2)e* 


Problem Set 5.2, page 179 


5. Po(x) = ye(231x® — 315x* + 105x2 — 5), 
Po(x) = 75(429x! — 693x° + 315x? — 35x) 


Al2 


Ap 


11 
15 


p. 2. Answers to Odd-Numbered Problems 


~Setx = az. y = cyP,(x/a) + coQn(x/a) 
PL=V1—x7, Ph=3xV1—27, P} = 301 — x), 


PZ = (1 — x?)(105x? — 15)/2 


Problem Set 5.3, page 186 


17. 


19 


~,-2% 42 4 _ sinx _ il x x 4 _ COS X 
oe 3! 5! a a oe x 
b= 1, Gat, 77=0 ».oe™.. y= ine 


4 


1.5, 1.6 
30x + yaax ee 


-yp=Hilt 5x? ax? + ax 


yo = x + gx? — igx* + gx? — apx° + - 
. ypVx, yg=1+x 
yr =e", yo =e" /x 
-y17 =e, yo=e*Inx 
.y = AF(, 1, —3;x) + Bx? FG, 3, 3: x) 
y =A(l — 8x + 32x?) + Bx?/4FG -2,55%) 
-y = €1F(2, 2, —33 t — 2) + colt — 2) FG, —3, 3 t — 2) 


Problem Set 5.4, page 195 


3. cyJo( Vx) 

5. cy (Ax) + Co (Ax), v #0, £1, £2, --- 

7. cyJ1/a(3x) + coJ—1/2(3x) = x7 /(G sin 3x + C2 cos $x) 
9.x (cy, (x) + CoJ_ (x), v #0, +1, +2, °°: 

13. Jn(x4) = Jn(xg) = 0 implies x7 "J,(%1) = xg “Jyn(x2) = 0 and 
[x ” or = 0 somewhere between x1 and xg by Rolle’s theorem. 

Now use (21b) to get J, +4(x) = 0 there. Conversely, Jy, +41(%3) = Jn+1(%4) = 0, 
thus x2 77J,, 44(%3) = x2 *1,,44(x4) = 0 implies J,,(x) = 0 in between by Rolle’s 
theorem and (2la) withy =n + 1. 

15. By Rolle, Jo = 0 at least once between two zeros of Jo. Use Jo = —J; by (21b) 
with v = 0. Together J; = 0 at least once between two zeros of Jp. Also use 
(xJ,)' = xJo by (21a) with v = | and Rolle. 

19. Use (21b) with v = 0, (21a) with v = 1, (21d) with v = 2, respectively. 

21. Integrate (21a). 

23. Use (21a) with v = 1, partial integration, (21b) with v = 0, partial integration. 

25. Use (21d) to get 


Pr 


1 
3 
5 


[sean = —2J4(x) + | scoax = —2J4(x) — 2Jo(x) + [ncoax 


= —2J4(x) — 2Jo(x) — Jo(x) + c. 


oblem Set 5.5, page 200 


- CyJa(x) + c2¥a(x) 
: cyJo/3(x7) a c2¥o/3(x") 
» CJo(Vx) + c2¥o(Vx) 


App. 2. Answers to Odd-Numbered Problems Al3 


7. Vx (cr1jag kx) + co¥%ja(gkx?)) 
9, x°(cyJ3(x) + co¥3(x)) 
11. Set H = kH™ and use (10). 
13. Use (20) in Sec. 5.4. 


Chapter 5 Review Questions and Problems, page 200 


11. cos 2x, sin 2x 

13. (x — 1)~°, (x — 1)’; Euler-Cauchy with x — 1 instead of x 
15. J), (x), Js) 

17%e"7,1 +x 

19. Vx Jy(V0), Vx (Vx) 


Problem Set 6.1, page 210 


1. 3/s” + 12/s 3. s/(s? + 1?) 
5. 1/((s — 2)? — 1) 7. (w cos @ + s sin 0)/(s? + w?) 
SS _ ,aos —bs 
9, t if - 1 1. 1 : be 
Ss s Ss Ss 
(l= P al -s —s 
13. —____— 15. e 1 ia 1 
" 2s? 2s Ss 


19. Use e“ = cosh at + sinh at. 


00 


23. Set ct = p. Then & f(ct)) = | e“'f(ct) dt = | e “/©P Fp) dp/c = F(s/c)/c. 


0 10) 

25. 0.2. cos 1.8 + sin 1.8 59 cee 
bi L 
29. 243 — 1.91° 31. a : 3 ) = 4e7* — 3¢7% 
s—2 st] 
agp 2 ae 0.5 + 2a 
(s + 33 (s + 4.5)2 + 4ar? 

37. mte~™* 39, 318e-tV2 
41. e~°” sinh at 43. e"(2 cos 3t + 3 sin 32) 


45. (ko + kiHe ™ 


Problem Set 6.2, page 216 


1. y = 1.25e7>" — 1.25 cos 2t + 3.25 sin 2t 
3. (s — 3\(s + 2) = Ils + 28 — 11 = lly + 17, Y= 10/(s — 3) + 1/(s + 2), 
y = 10e3! + et 
5. (s? — 4)Y = 12s, y = 12 coshst 
Ty =ge% + 80% +he°% 9.y =e — ec + 24 
11. (s + 1.5)°Y = 5 + 31.5 + 3 + 54/s* + 64/s, 
Y = I/(s + 1.5) + 1/(s + 1.5)? + 24/s* — 32/s? + 32/s?, 
y=(14+ deh! + 443 — 16r7 + 32¢ 
13.t=f-1, Y=4/~-6), J =4e%, y= 4e8ttP 


Al4 App. 2. Answers to Odd-Numbered Problems 


15.t=7 +15, (s—1)(s+4)¥ =459 + 17+ 6/(s — 2), y = 3e 19 + 2-1) 


2 
17. — 19. ie 
(s + a) s(s* + 40) 
21. £(f') = L(sinh 24) = s£(f) — 1. Answer: (s? — 2)/(s? — 4s) 
23. 1201 — e 7/4) 25. (1 — cos wf)/w 
27, 4(1 + t — cos 3 — Asin 34) 29, Lent +4 
a 


Problem Set 6.3, page 223 
3. £((t — 2)u(t — 2)) = e7 75/5? 


5. (e(1 us +) = : : ; qd - e 78/24 77/2) 


1 = = 
7 (e 2st) __ e A(s+ 7)» 


“s+7 
9 
9. emn(2 res *) 
9? s? S 
11. (se~79/? + e~78)/(s? + 1) 13. 2[1 + u(t — 7)] sin 3¢ 
15. (t — 3)3u(t — 3)/6 17. e* cos t (0 < t < 277) 
19. 3(e" — 1)8e7™ 21. sin 3¢ + sint (0 <1 < 77);$ sin 3t(t > 77) 


23. ef — sint(0<t<2m), e! —$sin2r(t > 27) 


25.t— sint(0<t< 1), cos(t—1)+sin(t — 1) — sint(t> 1) 

Wt=14+f, vy" +4y = 81 +7) — ut — 4)), cos 2t + 2t? — Lift <5, 
cos 2t + 49 cos (2t — 10) + 10 sin (2t — 10) if t > 5 

29. 0.11’ + 251 = 490e~™"[1 — u(t — 1)], 
= 20(e7>* _ e7 2501) + 20u(t — pie + e7 2508 + 2454 

31. Rq' + q/C=0, Q=L£q, gO) =CH, i=4q', 
R(sQ — CV) + O/C = 0, g = Che" 1% 

100, 10 1 1 


100, _ 100 _95 —2s 
33. 101 + —~I= ae , l=e < s+ 10 


Pago ite So 
35. i = (10 sin 10t + 100 sin A(u(t — 7) — u(t — 377)) 
37. (0.587 + 20)I = 78s(1_+ e~7%)/(s? + 1), 
i= 4cost — 4cos V/40t — 4u(t — 7)[cos t + cos (V40(t — 77))] 


) i= Oift < 2 and 


t 
39. i' + 21+ 2| i(r) dr = 1000(1 — u(t — 2)), I = 1000(1 — e7?5)/(s? + 2s + 2), 
0 


i = 1000e~* sin t — 1000u(t — 2)e~**? sin (¢ — 2) 


Problem Set 6.4, page 230 


3. y = 8 cos 2t + 5 u(t — 47) sin 2t 

5.sint(Q0<t<7); O(7 <t< 277); —sint(t > 277) 

Ty =e ' + 4e~* sin dt + Su(t — de 3° sin (At — 4) 

9. y = 0.1[e’ + e~*4(—cos t + 7 sin A] + O.1u(t — 10)[-e7* + 
e +30 (cos (t — 10) — 7 sin (t — 10))] 


App. 2. Answers to Odd-Numbered Problems Al5 


dspam e +e + Gat 0 = 30 + ey s 
u(t — Pe aaa a e 8t-2)) 
15. ke P*/(s — se~P*) (s > 0) 


Problem Set 6.5, page 237 


lt 3. (ef — e@')/2 = sinht 
5. $f sin wt Te’ -t-1 
9y-1*y=1, y=e 11. y = cost 


t 
13. y(t) + 2 | e'—7 y(t) dr = te’, y =sinht 
0 
17. et — ett 19. ¢ sin Tt 
21. (wt — sin wt)/w 23. 4.5(cosh 3t — 1) 
25. 1.5t sin 6t 


Problem Set 6.6, page 241 


2 s* — wo 
; (s + 3)" (s? + w*?* 
As? + 24 (382 — 71) 
. = ; 9. 2 23 
(s“ — 4) (s° + a7) 
4s” — a” 


11 


_ lf Hass 
- G+ ie 15. F(s) = -3(5 7 -) , f@ = gt sinh 3t 
17. Ins —In(s— 1); (-1 + e/t 
19. [In (s? + 1) — 2In(s — 1)]’ = 2s/(s? + 1) — 2/8 — 1); 2(—-cost + e/t 


Problem Set 6.7, page 246 
3. y, = —e7 + 4e™*, yo =e + 307" 
5. yy = —cost + sint + 1 + uit — 1)[-1 + cos(t — 1) — sin(@t — 1)] 
yo = cost + sint — 1 + u(t — 1)[1 — cos(t — 1) — sin(t — 1)] 
Tey, = —e7* + de® + Bult — 1)(—-e?-™* + 4), 


y= —eW 7 4 eb + zUu(t = 1)(-e3- 4 + e’) 
9. y, = B+ 4ne*, yo = (1 — 4he%* 
11,4, =e +e", ya =e” 
13. y, = —4e' + sin 10t + 4 cost, yo = 4e’ — sin 10t + 4 cost 


t -t t - 
15. yy =e’, yo=e'*", yg =e —e 


19. 4i, + 8(i, — in) + 2i/ = 390 cost, Big + 8(ig — iy) + 45 = 0, 
iz = —26e~** — 16e~* + 42 cost + 15 sint, 
in = —26e77* + 8e~-8! + 18 cost + 12 sint 


t 


Chapter 6 Review Questions and Problems, page 251 


i 13.4(1 — cos mt), 77?/Qs® + 277s) 
s2—4 s2—] 


15. e~ 38*3/2/(5 — 2) 17. Sec. 6.6; 2s?/(s? + 1)? 


Al6 


App. 2. Answers to Odd-Numbered Problems 


19, 12/(s*(s + 3)) 21. tu(t — 1) 
23. sin (wt + 0) 25. 317 + £3 
27. e 73 cost — 2 sin?) 
31. ¢e7' + u(t — m)[1.2 cost — 3.6sint + 2e'*7 — 0.8e7!-27] 
33.00 St=2), 1-2e°%? + e249 (> 2) 
35. y, = 4e’ — , yo = e! — e 2# 
37. yy = cost — u(t — 7) sint + 2u(t — 277) sin? ¥1, 
yo = —sint — 2u(t — 1) cos" st + u(t — 277) sint 
39. yy = C/V10) sin V102, ye = —C/V/10) sin V10t 
41.1-—etO0<t<4), (e*#-De* ¢>4) 
43. i(t) = e~ "(8 cos 3t — 3g sin 3t) — secos 10¢ + Asin 10¢ 
45. 5iz + 20(i1 — in) = 60, 3015 + 20(i5 — iz) + 2iyg = 0, 


iy = —Be7** + Se" + 3, ig = —4e77* + 4e7 08" 


Problem Set 7.1, page 261 


29. y = e (13 cost + 11 sinf) + 10t — 8 


3.3 x3, 3x4, 3x6, 2x2, 2x3, 3x2 
5.B=3A, iA 
7.No, no, yes, no, no 
0 6 12 0 2.5 1 0 8.5 13 
9.18 15 15}, 2:5 1.5 2), | 20.5 16.5 17 |, undefined 
3 0 =9 =] 2 = 2 2 —10 
0 26 5.4 0.6 
11.| 34 32}, same, | —4.2 2.4], same 
28 —10 —0.6 0.6 
70 28 
13.| —28 56], same, —D, undefined 
14. O 
5.5 —4,5 
15. 33.0], same, undefined, undefined 17.| —27.0 
—11.0 9.0 


Problem Set 7.2, page 270 
5. 10, n(n + 1)/2 


App. 2. Answers to Odd-Numbered Problems Al7 
10 —-14 —6 10 —-5 —I15 
11.| —5 7 —121], same, —14 —33], same 
=) = —4 —2 —4 —4 
1 2 0 =-9 =-5 
—9 3 4 
13.;2 13 —6}], 3. —1], undefined, 
-5 -l 0 
0 -6 4 4 0 
8 
15. Undefined, | -—4], [7 —l 3], same 
=3 
—30 —18 22 
17. 45 91, undefined, 41, undefined 
5 =7 —12 
10.5 7 
19. Undefined, QO |, | —-3], same 
=3 1 
25. (d) AB = (AB)' = B'A' = BA; etc. 
(e) Answer. If AB = —BA. 
29. p = [85 62 30)’, v = [44,920 30,940] 
Problem Set 7.3, page 280 
lx=-2, y=05 3.x y=3, z 5 
5.x =6, y=-7 7.x = —-3t, y=tarb, z= 2t 
9.x =3t-—1, y=-t+4, z=tarb. 
11. w = i X=, arb., yr 2to —, zZ=Te arb. 
i3.w=4, x=0, y=2, 2z=6 17. =2, L=6, = 8 
19. ET — (Ry + Ro)Eo/(R1R2) A, Ip = Eo/Ri A, Iz = Eo/R2A 
21. x5 = 1600 — x1, x3 = 600+ 4x4, x4 = 1000 — x1. No 
23. C:3x1 — x3 = 0, H: 8x1 — 2x4 = 0, O: 2x9 — 2x3 — x4 = 0, thus 
CsHg + 509 — 3COg + 4H2O 
Problem Set 7.4, page 287 
1.1; [2 -1 3]; [2 -1)" 3.3; {[3 5 O], [0 3 5], [0 O 1)} 


1 
5.3; {[2 —-I1 4], [0 1 —46], [0 0 
[ 


1}; {2 0 IJ), 


[0 3 23], 


Al8& 


App. 2. Answers to Odd-Numbered Problems 


7.2; [8 0 4 O], [0 2 0 4]; [8 0 4], [0 2 OJ] 
9.3; [9 0 1 O], [0 9 8 9], [0 0 1 QO] 


11. (c) 1 17. No 
19. Yes 21. No 
23. Yes 25. Yes 
27.2, [-2 0 1], [0 2 1] 

29. No 31. No 


33. 1, solution of the given system c[1 = 3], basis [1 = 3] 
35.1, [14 2 $ 


Problem Set 7.7, page 300 


7. cos (a + B) 9.1 
11. 40 13. 289 
15. —64 17. 2 
19. 2 21.x = 3.5, y=—-1.0 
23.x=0, y=4, z 1 25.w=3, x=0, y=2, z 2 


Problem Set 7.8, page 308 


1.20 4.64 
1, 3 2 02 -02 
0.50 3.60 
-30 -05 2 
1 oO 0 
5.) —2 1 0 7.4 ° SA 
a a 
0 o ¢ 
3.760 22.272 
9/2 0 0 11. (A*) 1 = (AP = 
2.400 15.280 
0 a 0 


15.AA7} =I, (AA4) 1! = (A474) 141 =F Multiply by A from the right. 
Problem Set 7.9, page 318 
1.01 ol, fo 1; O on, fo -11'; f 1)", [-1 177 


3.1, [1 11 -7" 5. No 


=] 


1 0 
7. Dimension 2, basis xe~*, e” 9. 3; basis : 
0 0 0 
V1. xy =5y, — ya, X2 = 3y1 — ya 
13. x1 = 2y, — 3ya, xg = —10yy + lOy2 + yz, x3 = —7Tyy + Ilye + yg 


App. 2. Answers to Odd-Numbered Problems 


15. 
19. 
23. 
25. 


-l1 6 1 1 i 2B 
11.|-18 8 -7], |-6 -8 2 
-13 -2 -7 -1 7 #7 
13.(21 -8 -31]', [21 -8 31] 
15.197, 0 
17. —5, det A? = (det A)? = 25, 0 
—2 -12 -12 
19.}-12 16 —-9 1L.4¢=4,.95 32, 2=8 
-12 -9 -14 
23.x=6, y=2t+2, z=tarb. 25.x=04, y=—-13, z=17 
27.x=10, y=-2 29. Ranks 2, 2, 
31. Ranks 2, 2, 1 33.1,= 165A, I2=11A, Ip=5.5A 
35. [1 = 4A, In = SA, Iz =1A 
Problem Set 8.1, page 329 
1.3, [1 oJ'; -0.6,[0 1° 3.-4,[2 9]'; 3, 1° 
5.—-3i,[1 —i]; 34,1 i,i= V—-1 
iL =0,. [i ol" 
9.0.8 + 0.6i,[1 —'; 08-0.6i,[1 a! 
11. —(A? — 18d? + 99d — 162)/(A — 3) = —(A? — 15A + 54); 3,[2 -2 1]'; 
6,[1 2 2]'; 9,[2 1 —2]' 
13. —(A — 9)?; 9,[2 —2 1)", defect 2 
15. (A + 12,2 + 2A - 15); -1,[1 0 0 oj", [0 0 oy'; 
=o) =) WD) i3 =) 1 =i 
0 -1 
17. . Eigenvalues i, —i. Corresponding eigenvectors are complex, 
1 0 
indicating that no direction is preserved under a rotation. 
0 0 0 1 
19. sil. ; 0, . A point onto the x9-axis goes onto itself, 
0 1 1 0 


23. 


V26 17. V5 

1 21.k = —20 
a=[3 1 —4]', b=[-4 

a=(5 3 2. b=B 2 


a point on the x,-axis onto the origin. 


8 —1]", |la+ bil = V107 5.099 +9 
—1)', 90+ 14 = 2(38 + 14) 


Use that real entries imply real coefficients of the characteristic polynomial. 


Al9 


A20 


App. 2. Answers to Odd-Numbered Problems 


Problem Set 8.2, page 333 


3 
5 
715 - 
9 


.1.5,[1 —1]', —45°; 4.5,[1 1)", 45° 
1,.1=1/V6 1)",1122°; 8:[1 1/6j",.22.2° 
.0.5,{1 —1]'; 1.5,[1 1]'; directions —45° and 45° 


eft 12° 16j" 

, Ls 

~c[10 18 25]" 

.x = (I — A) 1y = [0.6747 0.7128 0.7543]" 


17. AX; = AjXj (x; # 0), (A kI)x; = A;X; kx; = (A; k)Xx;. 


. From Ax; = A;x; (x; # 0) and Prob. 18 follows kpA?x; = kpAPx; and 
k gA?x; = kqA#x; (p 2 0, g 2 0, integer). Adding on both sides, we see that 
k pA” + kgA® has the eigenvalue k»AP + kgA#. From this the statement follows. 


Problem Set 8.3, page 338 


1.0.8 + 0.6i,[1 +7’; orthogonal 
-2 + 0.87, [1 +i]. Not skew-symmetric! 


.1,(0 2 1]'; 61 0 Oj', [0 1 —2]'; symmetric 


3 

5 

7.0, +25i, skew-symmetric 

9.1,(0 1 oF; «gf1 0 q's; -Z0. O 
15. No 17.47! 


-iy', orthogonal 


19. No since det A = det (A') = det (—A) = ( 


Problem Set 8.4, page 345 


3.008 —0.544 17 =) 
3. , 4, ; 6, ; 
15.456 6,992 31 ll 


1)3det (A) = —det (A) = 0. 


oak Eb Gb TE [! 


0 -5 15 1 0 
[= <2) fr =o 5 0 
9. A = 
le Og 2 1 0 0 
—2 1 t 1 2 0 
11. A = 
| 3 1 3 2 0 -5 


=] 3 0 1 
1}; x=/O], | 1], |-1 
1 1 0 1 


App. 2. Answers to Odd-Numbered Problems 


A21 


1 0 0 1 0 0 4 0 0 
13.}-2 1 OJA/2 1 Of={0 -2 0 
1-2 1 3 2 1 OG  @ 1 
zs; 3 & 1 -2 O 10 oO O 
15.|-3; § glAll 1 -1/=/|0 1.  O 
0 - ¢ 1 1 1 oY @ 4 
[7 3 i 
17,C = , 4yi + 10y3 = 200, x = —— y, ellipse 
13.7 2/-1 1 
[2 a9 ; it 4 
19.C = : 14y3 _ 8y3 =0, x =—— y; pair of straight lines 
| 11 3 2/1 -1 
| 1 -6 , f-l 1 
21.C = . Tyt—5y3 = 70, x =—= y, hyperbola 
|-6 1 a) f 4 
[-11 42 ie 
23. C = , S52y} — 39y3 = 156, x = —— y, hyperbola 
42 24 13/3 -2 
Problem Set 8.5, page 351 
1. Hermitian, 5, [-i 1], 7, [i 1" 
3. Unitary, (1 — iV3)/2, [-1 1; G+iV3)/2, [1 1" 
5. Skew-Hermitian, unitary, —i, [0 -1 1]', 4 [1 0 oj, fo 1 1° 
7. Eigenvalues —1, 1; eigenvectors [1 —1)", [1 1; {1 —i', [1 il"; 


(0 1)", [1 OJ", resp. 
9. Hermitian, 16 
.(ABC)' = C'B'A' =C7(-B)A 


if and only if HS = SH. 


11. Skew-Hermitian, —6i 


.A=H+S, H=4(A+A), S=4(A—A') HH Hermitian, S skew-Hermitian) 
.AA' — A'A = (H + S)(H — S) — (H — S)(H + S) = 2(—-HS + SH) = 0 


Chapter 8 Review Questions and Problems, page 352 


11.3,(1 1; 20 -1)" 

19,5.41. Se Fak 

15.0,(2 —2 1"; 9,[-1+3i 1+ 33 
i| 2 =ali2e 2 

17.-1,1; A=— = 
16-3 5]//30 1 


4y; =94[=1< 3% 1=3) ay" 


A22 


App. 2. Answers to Odd-Numbered Problems 


i} 2 -1] [2 1 -09 0 
19. — A = 
3}-1 2 i 2 0 0.6 
1 1 -l 1 2 1 4 0 0 
i 1. =] O|A 1 -l 1};=|0 —20 0) 
0 1 1 —1 1 2 0) 0 22. 
[4 12 ;/2 1 
23. C = ; 10y7 _ 20y3 = 20, x =— = y, hyperbola 
[ 12 —-14 5}1 -2 
[3.7 1.6 : ; c\2 7 
25. C = » 45yz + O.5yg = 45, x = — y, ellipse 
[1.6 13 V5{1 -2 
Problem Set 9.1, page 360 
1.5,1,0; V26; [5/V26, 1/V26, 0] 
3. 8.5, —4.0, 1.7; V91.14, [0.890, —0.419, 0.178] 
5.2,1,-2; u= (3, es 21, position vector of Q 
7. 0: (4,0,4), |v] = V1625 9. 0:(0,0, -8), |v] =8 
11. [6,4,0), 1,01, 1-3, =2,0) 13. [1, 5, 8] 
15. 7[9, —7, 8] = [63, —49, 56] 17. [12, 8, 0] 
21. [4, 9, —3], V 106 23. [0, 0, 5], 5 
25. [6, 2, —14] = 2u, V236 27. p = [0, 0, —5] 
29. v = [U1, Vo, 3], U1, Ve arbitrary 31. k = 10 
33. |p+q+ul| S18. Nothing 
35. vg — va = [-19, 0] — [22/V2, 22/V2] = [-19 — 22/V2, -22/V2] 
37.u+v+ p= ([-k,0] + [/,/] + [0, -1000] = 0, -k+/+0=0, 


0+/-— 1000 =0, / = 1000, k = 1000 


Problem Set 9.2, page 367 


.44, 44, 0 3. V35, V320, V86 

. |[2, 9, 9]| = V166 = 12.88 < V80 + V86 = 18.22 

.|-24| = 24, lalle| = V35V86 = V3010 = 54.86; cf. (6) 

. 300; cf. (5a) and (5b) 13. Use (1) and |cos y| S 1. 


.|a+ bl? + |a— bl? =aea+ 2aeb+ beb+t (aca— 2aeb + beb) 


= 2lal? + 2|b|? 


- [2, 5, 0] » [2, 2, 2] = 14 
. [0, 4, 3] * [-3, —2, 1] = —S is negative! Why? 
. Yes, because W = (p+ qg)°ed=ped+qed. 23. arccos 0.5976 = 53.3° 
. B — ais the angle between the unit vectors a and b. Use (2). 
. y = arccos (12/(6V/13)) = 0.9828 = 56.3° and 123.7° 
28 


4 =-% a3: 215.5) 
.(a + b)*(a — b) = [al? — |b/? = 0, [al = |b]. A square. 
.0. Why? 


Me Gig lal Bs |b| or if a and b are orthogonal. 


App. 2. Answers to Odd-Numbered Problems 
Problem Set 9.3, page 374 
5. —m instead of m, tendency to rotate in the opposite sense. 
7. |v| = |[0, 20, 0] x [8, 6, 0]] = |[0, 0, —160]| = 160 
9. Zero volume in Fig. 191, which can happen in several ways. 
11. [0, 0,7], [0,0,—7], —4 13. [6,2,7], [-6, —2, —7] 
15.0 17. [—32, —58, 34], [—42, —63, 19] 
19.1, —-1 
21. [—48, —72, —168], 12\/248 = 189.0, 189.0 
23.0, 0, 13 
25.m = [—2, —2, 0] X [2, 3, 0] = [0,0, —10], m = 10 clockwise 
27. [6, 2,0] x [1, 2, 0] = [0, 0, 10] 29. 4|[—12, 2, 6]| = V46 
31. 3x + 2y-—z=5 33. 474/6 = 79 
Problem Set 9.4, page 380 
1. Hyperbolas 
3. Parallel straight lines (planes in space) y = 3x ae 
5. Circles, centers on the y-axis 
7. Ellipses 9. Parallel planes 
11. Elliptic cylinders 13. Paraboloids 
Problem Set 9.5, page 390 
1. Circle, center (3, 0), radius 2 3. Cubic parabola x = 0, z = y> 
5. Ellipse 7. Helix 
9. A “Lissajous curve” ll.r = [3 + V13cost,2 + V13 sing, 1] 
13.r = (2+ 4,1 + 21,3] 15. r = [¢, 4¢ — 1, 52] 
17. r = [V2 cos?, sint, sin #] 19. r = [cosh t, (V3/2) sinh t, —2] 
21. Use sin (—a@) = —sina. 
25. u = [—sin ¢, 0, cos f]. At P, r’ = [—8, 0, 6]. q(w) = [6 — 8w, i, 8 + 6w]. 
27. q(w) = [2 + w,4 — aw, 0] 29. Vr’ er’ = cosht,/ = sinh] = 1.175 
31. Vr' er’ =a,1 = at/2 33. Start from r(t) = [t, f(d)]. 
35.v =r’ = [1,220], lv] = V1+ 427, a= [0,2,0] 
37. v(0) = (w + 1) Ri, a(0) = —wRj 
39. v = [—-sint — 2 sin 2t, cos t — 2 cos 2r], lv|? = 5 — 4cos 3t, 


41. 


43. 


45. 


49. 


6 sin 3t 


a = [-cost — 4cos 2t, —sint + 4 sin 2f], and ajgn = =————-V. 
5 — 4cos 3t 


v = [sin ¢, 2 cos 2t, —2 sin 2r], |v|? = 4 + sin? t, 
; 3 sin 2t 
a = [-cos f, —4 sin 2t, —4 cos 2f], and aan = aa 
4+ sin“ t 


1 year = 365 - 86,400 sec, R = 30+ 365 - 86,400/27 = 151 - 10° {km], 

lal = wR = |v|?/R = 5.98 - 107® [km/sec?] 

R = 3960 + 80 mi = 2.133 - 10’ ft, g = lal = w?R = |v|?/R, |v] = VgR = 
V6.61 - 108 = 25,700 [ft/sec] = 17,500 [mph] 

r() = [4 y(), O], rv’ =[ly’, OJ rer’ =1+4 y, etc. 


A23 


A24 


App. 2. Answers to Odd-Numbered Problems 


dr dr /ds d*r dr /(ds \? ar = dr /(ds 3 
51. = ; 7 = ) / ae 3 = 3 i t 
ds dt/ dt ds dt dt ds dt dt 


53. 3/(1 + 917 + 914) 


Problem Set 9.7, page 402 


1. [2y — 1, 2x + 2] 3. [—y/x?, 1/x] 
5. [4x3, 4y3] 7. Use the chain rule. 
9. Apply the quotient rule to each component and collect terms. 

11. [y,x], [5, —4] 

13. [2x/(x2 + y?), 2y/(x? + y?)],  [0.16, 0.12] 


15. [8x, 18y, 2z], [40, —18, —22] 17. For P on the x- and y-axes. 
19, [—1.25, 0] 21. [0, —e] 

23. Points with y = 0, +7, +27,---. 25. —VT7(P) = [0, 4, -1] 

31. Vf = [32x, —2y], Vf(P) = [160, —2] 33. [12x, 4y, 2z], [60, 20, 10] 
35. [—2x, —2y, 1], [-6, —8, 1] 37. (2, 1] * [1, -1]/V5 = 1/V5 
39. [1, 1, 1] * [—3/125, 0, —4/125]/V3 = —7/(125V3) 

41. 8/3 43. f = xyz 


45. f = fuydx + fugdy + fus dz 


Problem Set 9.8, page 405 


1. 2x + 8y + 18z; 7 3. 0, after simplification; solenoidal 
5. 9x"y?z?, 1296 7. —2e* (cos y)z 
9. (b) (fur)a + (flay + (fes)z = fl@px + Way + 3)z] + fev1 + fyve + fev3, ete. 
11. [v3, va, v3] =r’ = [x', y’,z’] = [y, 0,0], 2 =0,z=cg, y’ =0,y = co, and 
zx = y = Co,X = Cot + cy. Hence as ¢ increases from 0 to 1, this “shear flow” 
transforms the cube into a parallelepiped of volume 1. 
13. div (w X r) = 0 because U1, Vg, v3 do not depend on x, y, z, respectively. 
15. —2 cos 2x + 2 cos 2y 17. 0 
19. 2/(x2 + y? + 22)? 


Problem Set 9.9, page 408 


3. Use the definitions and direct calculation. 
5. ea? = 9°), 9G? — 27,207 — 29) 7. e~ [cos y, sin y, 0] 
9. curl v = [—6z, 0, 0] incompressible, v = r’ = [x’, y’,z’] = [0, 32,0], x=c4, 
Z=c3, y’ =3; = 3c%, y= 3c3t + co 
11. curl v = [0, 0, —3], incompressible, x = y, y! = —2x, 2xx’ + yy’ = 0, 
x + dy? =C€,Z= 3 
13. curl v = 0, irrotational, div v = 1, compressible, r = [cye’, coe’, c3e"]. Sketch it. 
15. [—1, —1, —1], same (why?) 
17. —yz — zx — xy, 0 (why?), -y -—z-— x 
19. [—2z — y, —2x — z, —2y — x], same (why?) 


App. 2. Answers to Odd-Numbered Problems A25 


Chapter 9 Review Questions and Problems, page 409 


11. —10, 1080, 1080, 65 
13. [—10, —30, 0], [10, 30,0], 0, 40 

15. [—1260, —1830, —300], [—210, 120, —540], undefined 

17, -125, 125, —125 

19. (70, —40, —50], 0, W352 + 207 + 257 = 2250 

21. [-2, —6, —13] 

23. y, = arccos (—10/-V65 - 40) = 1.7682 = —101.3°, yp = 23.7° 


25. [5, 2, 0] ° [4 — 1,3 — 1,0] = 19 27. v° w/|w| = 22/V8 = 7.78 
29. [0, 0, — 14], tendency of clockwise rotation 31. 4 

33.1, —2y 

35.0, same (why?), Ay? + x7 — xz) 

37. [0, —2, 0] 39, 9/225 =2 


Problem Set 10.1, page 418 


3.4 
5.r=([2cost, 2sinf], OStS7/2; 2 

7. “Exponential helix,” (e%* — 1/3 9. 23.5, 0 

11. 2e7' + 2te*, 9 —-2e72 — e443 15.187, $(477)?, 187 


17.[4cost, +sint, sint, 4cos/], [2,2,0] 19. 14444, 1843.2 


Problem Set 10.2, page 425 


3. singxcos2y, 1 — 1/V2 = 0.293 5,6" sing, ¢ =O 

7. cosh 1 — 2 = —0.457 

9. e* coshy + e* sinhy, e — (cosh1 + sinh 1) = 0 

13. e* cos 2b 15. Dependent, x? # —4y”, etc. 
17. Dependent, 4 + 0, etc. 19. sin (a® + 2b? + c?) 


Problem Set 10.3, page 432 


1 
3. 8y3/3, 54 5. | [x — x38 — x? — x5] de = a 
0 
7. cosh 2x — cosh x, 3 sinh 4 — sinh 2 9.36 + 27y”, 144 
11.2=1-—1r7, dedy =rdrd6, Answer: 7/2 
13. x = 2b/3, y =h/3 15.x=0, y =4r/37 


17.1, = bie /12, 1, = b7h/A 
19. I, = (a + byh?/24, I, = h(a* — b*)/(48(a — b)) 


Problem Set 10.4, page 438 


1.(-1-1)- 7/4 = -77/2 3. e? — 1) — Be? - 1) 
5. 2x — 2y, In — x2) -(Q— x?) +1, x= -1---1, -8 
7. 0. Why? 9, 18 


13. V?w = coshx, y= x/2---2, gcosh4— 3 


A26 App. 2. Answers to Odd-Numbered Problems 


15. V2w = 6xy, 3x(10 — x2)? — 3x, 486 17. V?w = 6x — 6y, — 38.4 
19. | grad w|? = e?*, 3(e* — 1) 


Problem Set 10.5, page 442 
1. Straight lines, k 


3.2 =cVx2 4+ i, circles, straight lines, [—cucosv, —cusinv, ul] 

5.2=x74+ y’, circles, parabolas, [—2u? COS U, —2u? sin v, ul 

Ti ape + yb" + z7/¢7 =1, [bcecos*uvcosu, accos7u sinu, ab sinv cos v], 
ellipses 


11. (a, o, @, +07], N =[-2%, -25, 1] 
13. Set x = u and y = v. 
15.[2+5cosu, —-1+5sinu, v], [5cosu, 5sinu, O] 


17. ae cos c cosu, —2.8+ acosusinu, 3.2 + asinv], a= 1.5; 
[a2 cos” v cos Uu, a” cos” v sin Uu, a” cos v sin v] 
19. [coshu, sinhu, v], [coshu, —sinhu, 0] 


Problem Set 10.6, page 450 


1. F(r) + N = [-uv?, v2, O]*[-3, 2, 1] = 3u2 + 2v?, 29.5 

3. Fir) * N = cos® v cos u sin u from (3), Sec. 10.5. Answer: 3 

5. F(r) *N = —u?, -1287 

7.F*N=[0, sinu, cosv]*[l,—2u,0], 4+ (-2 + 77/16 — 1/2)V2 = —0.1775 

9.r=([2cosu, 2sinu, vl], OSuS7/4, 0Sv S5. Integrate 2 sinh v sin u to 
get 2(1 — 1/V2)(cosh 5 — 1) = 42.885. 

13. 7777/6 = 88.6 

15. G(r) = (1 + 9u4)?/?, |N| = (1 + 9u4)¥/?. Answer: 54.4 


7 aa ||eo — y+ z7]adA 


2ar h 
23.[ucosv, usinv, ul], | | Fry ae te ema 
0 -o V2 
25. [cosucosv, cosusinv, sinu], dA = (cos u) du dv, B the z-axis, Ip = 87/3, 
Iz = Ig + 1" + Aor = 20.9. 


Problem Set 10.7, page 457 


1, 224 

3. —e 1? + oe ¥?, e712 4 6% De 38 — oe 2 — Qe 4:1 

5. (sin 2x) (1 — cos2x), % 2 

7.[rcosucosv, cosusinv, rsinu], dV= r? cos u dr du du, 0 =0, 27a?/3 
9 


. div F = 2x + 2z, 48 11. 12(e — 1/e) = 24 sinh 1 
13. div F = —sin z, 0 15. 1/7 + 34 = 0.5266 
17. ha /2 19. 8abc(b? + c)/3 
21. (a*/4) + 20+ h = ha*a/2 23. h°ar/10 


25. Do Prob. 20 as the last one. 


App. 2. Answers to Odd-Numbered Problems A27 


Problem Set 10.8, page 462 


1.x = 0, y = 0,z = 0, no contributions. «=a: af/dan = df/ax 2x 2a, ete. 
Integrals x = a: (—2a)bc, y=b: (—2b)ac, z=c: (4c) ab. Sum 0 

3. The volume integral of 8y? + [0, 8y] + [2x, 0] = 8y? is 8y3/3 = 8. The surface 
integral of fag/dn = f + 2x = 2f = 8y” over x = 1 is 8y?/3 = &. Others 0. 

5. The volume integral of 6y? + 4 — 2x? + 12 is 0; 8(x = 1), —8(y = 1), others 0. 

.F=([x, 0, OJ], div F = 1, use (2*), Sec. 10.7, etc. 

.z = Oand z= Va? — x? y = Va? r2, dx dy = rdr dé, 
~2ar + Ba? = 7298-2 |% = Para 

ll.r=a, ¢=0, cosp=1, v =§a+ (47a) 


~I 


N=) 


Problem Set 10.9, page 468 


1.8:2=yOSx10Sy4), [0,2z, —-2z]*[0,-1,1], +20 
3. [2e-* cosy, —e*, O]*[0, —-y, 1] =ye™*, +(2-2/Ve) 
5. (0, 2z, 3] *(0,0,1]=3, +2a? 

7.[-e, —e*, —e4%]*[-2x, 0, 1], +(e*-2e4+ 1) 

9. The sides contribute a, 3a?/ 2, —a, 0. 


11. —277; curl F = 0 13. 5k, 8077 
15. [0, —1, 2x — 2y] * [0, 0, 11,3 
17.r = [cosu, sinu, vl], [—3v2, 0, O]e[cosu, sinu, O], —1 
19.r=[ucosv, usinv, uj], OSuSl, OSvSET/2, 
[-e*, 1, O]*[-ucosv, —usinu, ul]. Answer: 1/2 


Chapter 10 Review Questions and Problems, page 469 


l1.r =[4—- 101, 2+ 87, Fr) + dr = [2(4 — 100”, —4(2r + 87] *[-10, 8] dr; 
—4528/3. Or using exactness. 

13. Not exact, curl F = (5 cos x)k, +10 15. 0 since curl F = 0 

17. By Stokes, +1877 19. F = grad (y2 + xz), 27 


21.M=8, x=% y=% 


25.M=4k/15, x=2, Y=q 27. 288(a+b+o)T 
29. div F = 20 + 62”. Answer: 21 31. 24 sinh 1 = 28.205 
33. Direct integration, — 35. 7277 


Problem Set 11.1, page 482 


1. 207, 277, 77, 7, 1, 1,4, 3 5. There is no smallest p > 0. 
4 1 1 . 1... |e 

13. — (cos x + —cos 3x + —cos 5x + -::) +2 (sinx + =sin3x + =sin5x + -::-) 
T 9 25 3 5 


15. $77 + 4 (cos x + ¥ cos 2x + $ cos 3x + a) — Aq (sin x + 3 sin 2x + 
z sin 3x +--+) 


T 4 1 1 
4 + foe 
17 (cos x 9 cos 3x 75 cos 5x ) 


A28 App. 2. Answers to Odd-Numbered Problems 


T 2 1 1 1 
.— a + — +--+) + sinx — —sin 2x + 
19 a (cos. 9 ©98 3x 75 cos 5x ) sin x 7 Sin 2x 
1 
= sin 3x — +--- 
3 ain x 


21. 2(sinx + 3 sin 2x + sin 3x + Zsin 4x + § sin 5x + se) 


Problem Set 11.2, page 490 


1. Neither, even, odd, odd, neither 3. Even 5. Even 


9. Odd, L = 2, * (sin 2 + bin SZ +b sin SB 4...) 
7 2° a Se 


4 
=a, 


11. Even, L = 1, 2 (cos mx — Leos 2mx + Leos 3mx— +---) 
3° °«C07T 4 9 


13. Rectifier, L = Z i = 1a (cos 277x + J cos 67x + F568 107rx + -) + 
2 8 WT 9 25 


1/1 1 1 1 
€ sin 277x 4 sin 477x 6 sin 677x 8 sin 87x ) 


A 


15. Odd, L = 7, (sin isin 3x + se sin Se — +--+) 


- 3 


17. Even, Lb = 1, = + £5 (cos mx + d eustrs i= toe airs + -) 
7 9 25 


i) 


19.2 + cos 2x + % cos 4x 


4/. mx 1. 37x 1. Sax 
Sp oe + + ae es 
23.L=4, (a) 1, (b) (sin a 3 sin 7 5 sin i ) 


9 25 
(b) 2(sin x + 3 sin 2x + % sin 3x + fsin 4x + ---) 
377 2, 


27.L= 77, (a) 8 + 2 (cosx 5 608 2x + 5c08 3x + 55 008 5x 


25. L = 7, (@) Z+A(cosx + deos3e + cos sx ++), 
7 


1 1 1 1 1 
— + —cos 7x + — —— + — fore 
18 cos 6x 49 cos 7x 81 cos 9x 50 cos 10x 1 cos llx ) 


(1 + 2) sinx + Dio + (3 _ 2) sina + a + 
TT 2 3. On 4 

1 2 1 

— + — 4 r+ — 1 a rere 

( 2) sin 5x 6 sin 6x 


29. Rectifier, L = 77, 


2 — 4 ( cose + conte + Ho cos se +), (b) sin x 


Problem Set 11.3, page 494 


3. The output becomes a pure cosine series. 
5. For A, this is similar to Fig. 54 in Sec. 2.8, whereas for the phase shift By, 
the sense is the same for all n. 


App. 2. Answers to Odd-Numbered Problems A29 


7.y = Cy cos wt + Cosinwt + a(w)sint, a(w) = 1/(w 1) = —1.33, 
—5.26, 4.76, 0.8, 0.01. Note the change of sign. 


4 1 
lly=C t + Co sin wt + — int + —5——~ sin 3t + 
y 1 COS W 2 Sin W a (= _ 9 sin we = 49 sin 
sin 5t + -) 
wo” — 121 
N 
13. y = >) (A, cos nt + By sinnt), Ay = [1 — n?)ay — nbycl/Dy, 
n=1 


By = [1 — n)by + ncay]/Dn, Dy = (1 — 7)? + nc? 


15. by, = (—1)""* - 12/n? (n odd), y = >, (Ancos nt + By sin nt), 
n=1 
Ay = (-1)"+ 12nc/n®Dy, Bn = (—1)"*? + 121 — n?)/(n3 Dy) with Dy as in 
Prob. 13. 
17. 1 = 50 + Ay cost + By sint + Agcos 3t + Bg sin 3t + «++, Ay = (10 — n”)ay/Dn, 
By = 10ndy/Dn, dn = —400/(n?27), Dy = (n? — 10)? + 100n? 


= 2400(10 — n?) 
19. 1(t) = > (An cos nt + By sinnt), Ay = (—1)"*" , 
2 
n=1 n Dn 
24,000 
By = (-1)"*? , Dn = (10 — n?)? + 100n? 
nDy, 


Section 11.4, page 498 


T 4 1 1 
, + — +++), E*¥ =0. 
3.F 5 (cos. 9 cos 3x 75 cos 5x ).e 0.0748, 


0.0748, 0.0119, 0.0119, 0.0037 
5. F = “(sins + sin 3x + zsin 5x + 7) E* = 1.1902, 1.1902, 0.6243, 0.6243, 


0.4206 (0.1272 when N = 20) 
7. F = 2[(a? — 6) sinx — § (407? — 6) sin 2x + sy (9m? — 6) sin 3x — +---]; 
E* = 674.8, 454.7, 336.4, 265.6, 219.0. Why is E* so large? 


Section 11.5, page 503 


3. Setx =ct+ k. 5.x = cos 6, dx = —sin 6 dé, etc. 
7.Am = (mrr/10)°, m = 1,2,°++3 ym = sin (m7rx/10) 

9.A = [(Qm + Ir/2D]’, m = 0, 1,°++,¥m = sin (2m + 1)ax/(2L)) 
11. Am, = m2, m = 1,2,°*+, ¥m = x sin (m In |x|) 


—4x 


Bp] e".¢=6.7Se" A, Ht a He sin mx,m = 1, 2,--- 


Section 11.6, page 509 


1. 8(Py(x) — P3(x) + P5(x)) 

3. 5 Pox) — 7 Pax) — 35 Pal) 

9. —0.4775Py(x) — 0.6908P3(x) + 1.844P5(x) — 0.8236P7(x) + 0.1658Po(x) + ---, 
mo = 9. Rounding seems to have considerable influence in Probs. 8-13. 


A30 


App. 2. Answers to Odd-Numbered Problems 

11. 0.7854 Po(x) — 0.3540Po(x) + 0.0830P4y(x) — +++, m9 = 4 

13. 0.1212Po(x) — 0.7955Po(x) + 0.9600P4(x) — 0.3360P6(x) + +--+, mo = 8 
15. (C) dm = (2/J 1(a0,m)) (J1(@0,m)/ Q0,m) = 2/(A0,mI1(A0,m)) 


Section 11.7, page 517 


1. f(x) = mre~*(x > 0) gives A = | e’ cos wu du = : B= ue 
0 


(see Example 3), etc. 


3. Use (11); B = 2 | 5 sin wo dv _ 1~ cos Tw 


Ww 
0 
af 4 
: sin w — wcos w 
5. B(w) = — — rv sin wudu = 5 
T 2 w 
oo . 
7 2 sin w COS xw 
ered w 


0 


2 io} 
9. A(w) = cos Ww’ dv = e-” (w >0) 
7 ‘6 1+v? 


i. + 
n2| cos 7w + | 


3 cos xw dw 
Wd, l1—-—w 
15. For n = 1, 2, 11, 12, 31, 32, 49, 50 the value of Si(n7r) — 7/2 equals 0.28, —0.15, 
0.029, —0.026, 0.0103, —0.0099, 0.0065, —0.0064 (rounded). 
17. ea | 2 Sew sin xw dw 
7 w 
0 
2 [“w— e(wcosw — sin w) 
19. 3 sin xw dw 
ad 1+w 
Section 11.8, page 522 
1. few) = VQ/7) (2 sin w — sin 2w)/w 
3. fe(w) = V(2/77) (cos 2w + 2w sin 2w — 1)/w? 
A 2 (w? — 2) sin w + 2w cos w 
5. fo(w) = 3 
7 w 
7. Yes. No 9. 2/7 w/(a2 + w?) 
11. V2/7 ((2 — w?) cos w + 2w sin w — 2)/w? 
1 2 1 2 1 2 2 w 
13. ¥,(e~”) = —| -F(e”) 4 -1l)j= . + = 
settee) 1 ele) \ 7 ) 1(/2 w2 +1 /2) Varw2 +1 


Problem Set 11.9, page 533 


3. i(e OY — e*”)/(wV271) if a < b; 0 otherwise 

5. [et eune _ g Poy iar — iw)) 

7. (e (1 + iaw) — 1I)/(V2atw?) 9. V2/at(cos w + wsinw — 1)/w? 
11. iV2/7 (cos w — 1)/w 13. e~”/? by formula 9 


App. 2. Answers to Odd-Numbered Problems A31 
17. No, the assumptions in Theorem 3 are not satisfied. 


WlAththth, A-ita-bt ifs A-hth-Sa At fe — fs - fal 
1 1 ‘] 7 Ath 
fe 


fi-ha 
Chapter 11 Review Questions and Problems, page 537 


21. 


I =] 


11.14 (sin + —sin + Sin +) 


13. ~ 25 (cos me + Leos are + 3 cos Sax + St 


i 
4 
| a 1. oe 

— | sin 7x — = sin 27rx + — sin 377xX — +::: 
T 2 3 


15. cosh x, sinh x (—5 < x < 5), respectively 17. Cf. Sec. 11.1. 
1 4 1 2 1 

19. 4, (cos as + — cos 37rx + ~), 2 (sin ax — =sin 27x + -.--) 
2 9 T 2 


cos ¢ 1 cos 2t + 1 cos 3t 


w=1 4 ef =4 9 @ =9 


21. y = Cycos wt + Cysin wt + 7, 1o( 


1 cos 4t ) 
16 w* — 16 
23. 0.82, 0.50, 0.36, 0.28, 0.23 
25. 0.0076, 0.0076, 0.0012, 0.0012, 0.0004 


1 “(cos w + wsinw — 1)cos wx + (sin w — wcos w)sin wx 
27. 5 dw 
w 


wn 
0 


29. 2/7 (cos aw — cos w + aw sin aw — w sin w)/w? 


Problem Set 12.1, page 542 


1. L(cyuy att C2U9) = cyL(uy) a coL(ug) =C1* O+ co° 0=0 


3.c =2 5.c =a/b 
7. Any c and w 9.c = 7/25 
15. uw = 110 — (110/In 100) In (x2 + y?) 17. u = a(y) cos 4arx + b(y) sin 4arx 


19. u = c(x) ew ¥3 

21. u = e 8%(a(x) cos 2y + b(x) sin 2y) + 0.1e4 

23. u = cx(y)x + co(y)/x” (Euler-Cauchy) 

25. u(x, y) = axy + bx + cy +k; a, b, c, k arbitrary constants 


Problem Set 12.3, page 551 
5. k cos 37rt sin 37rx 


k 1 l 
7. 3 (cos Tt sin 7x + = cos 37 sin 37x + —Z cos 57 sin 57x + ++: ) 
T 27 125 


9. us : (cos Tt sin 7x — —cos 37rt sin 377x + => cos Sat sin 57x — +--- 
o 9 25 


A32 


App. 2. Answers to Odd-Numbered Problems 


11. 2,(@ — V2) cos 7t sin 7x — 5@ + V2) cos 37 sin 37rx 


+3e0+ V2) cos 5at sin 57x — +) 
44+ 37 


4 : ; : 
13. (a — 7) cos Tt sin 7x + cos 27 sin 27rx + cos 37t sin 377x 


4 — 57 
125 


2 2 2 
17. u = 5 (cos (=): Se a cree (32): sin 72 4...) 
T L E 3 L L 


19. (a) u(0, t) = 0, (b) u(Z, t) = 0, (c) u,(0, t) = 0, (d) uz(Z, t) = 0. C A, D B 
from (a), (c). Insert this. The coefficient determinant resulting from (b), (d) must be 
zero to have a nontrivial solution. This gives (22). 


cos 57rt sin 57x + ), No terms with n = 4, 8, 12,---. 


Problem Set 12.4, page 556 


3. c2 = 300/[0.9/(2 - 9.80)] = 80.837 [m?/sec”] 
9. Elliptic, uw = fy(y + 2ix) + foly — 2ix) 

11. Parabolic, u = xfi(x — y) + fo(x — y) 

13. Hyperbolic, u = f(y — 4x) + foly — x) 


1 
15. Hyperbolic, xy’? + yy’ = 0, y = 0, xy = W, Uy = zu yhiay) + f(y) 


17. Elliptic, u =fi(y — 2 — dx) + foly — (2 + Dx). Real or imaginary parts of any 
function u of this form are solutions. Why? 


Problem Set 12.6, page 566 


; -t : —4t 
3.u, =sinxe , Ug =sin2xe ~, 


. = 2. 
5u=snticke Oo 


800/ . = 1. = 
Las (si O.Larx e7 O01 TZ Tq sin 0.377x € O.O17628me¢ + J 
7 3 


ug = sin 3x e~™* differ in rapidity of decay. 


9. u = uy + uy, Where uyy = u — uy Satisfies the boundary conditions of the text, 


L 
7 2 NX 2. 2 . NWTX 
so that u,, = > By sin 7 Cane i = 2/ [f() — w,@)] sin — dx. 


n=1 0 

11. F = Acospx + Bsinpx, F’(0)=Bp=0, B=0, F’(L) = —ApsinpL = 0, 

p =nt/L, etc. 
13. u = 

1 4 1 1 
15. 5 + 4 (cosxert + 9 °°8 3x e 8 + 35 098 5x e@2Ot + J 

Kir < 2 
17. — nB,e 0" 
n=1 


19. uw = 1000 (sin § 7x sinh 3 7Ty)/sinh 7 


sin 


100 & 1 _ Qn —lymx . | Qn — lymy 
21.0 = h 
Ee Ope ae Ir 240° (4 


n=1 
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sinh (n7rx/24) nTry 
sinhnt 24” 


23. u = Agx + ®, Ay 
n=1 
1 24 1 24 7 
n 
Ao = ;| f(y) dy, An = | f(y) cos “> dy 
24° Jy 12 Jy 24 
= _ niwx .. ni(b—y) 2 
25. S An sin 7 sinh 7 An = @ sinh (wmb/a) 


n=1 


| fQo sin ax 
a 
0 


Problem Set 12.7, page 574 


2[~ 2 ‘ Qa 
3.A = | aan pest a - | e P-© Pt cos px dp 
TJ, 1 rv T 2 0 
1 
2 2 + psinp — 1 
s.4=2 | UV cos pu dv = poe pee . 6te. 
7 7 7) 
0 
2 (“si 2 
A= ZI my cos pu dv = —- = Lif0<p< land Oifp > 1, 
1 
u= | cos px e~° ? ' dp 
0 
9. Set w = —v in (21) to get erf (—x) = —erf x. 


13. In (12) the argument x + 2czV‘ is 0 (the point where f jumps) when z = —x/ (2cV‘). 
This gives the lower limit of integration. 
15. Set w = s/V2 in (21). 


Problem Set 12.9, page 584 


1. (a), (b) It is multiplied by V2. (c) Half 

5. Bmn = (—1)"*18/(mn7r?) ifm odd, 0 if m even 
7. Binn, = (—1)™*"4ab/(mntr”) 

11. u = 0.1 cos V20¢ sin 2x sin 4y 

13. ss > yY ! 3 COS (Vm? + n”) sin mx sin ny 


T mn 


17. c77V260 (corresponding eigenfunctions F416 and F16,14), etc. 
4 ‘ 4 
19. cos (a = + ) sin LE sin ups 
a b a b 


Problem Set 12.10, page 591 


5. HO" Goede 088 ee ere —+::-) 
T 3 5 


7. 5577 = ced 9? costd += 2p cee Ss ree) 
T 9 25 


A34 


App. 2. Answers to Odd-Numbered Problems 


11. Solve the problem in the disk r < a subject to ug (given) on the upper semicircle 
and —ug on the lower semicircle. 


“= 4uo(£ sino + 9 a9 alg 29? Gin'50 + -) 
7 \a 3a Sa 
13. Increase by a factor V2 15. T = 6.826pR7f% 
17. No 25. a41/(277) = 0.6098; See Table Al in App. 5. 


Problem Set 12.11, page 598 


5. Aq = Ag = Ag = Atnp = 0, As = 605/16, Az = —4125/128, Ag = 7315/256 
9. V2u =u" + 2u'/r=0, u"/ul = —2/r, In|u’| = -2In(7| + cy, 
ul =Eé/r?,u=c/rt+k 
13. u = 320/r + 60 is smaller than the potential in Prob. 12 for 2 <r < 4. 
17.u=1 
19. cos 2 = 2 cos” b —1,2w7-1= $ Pow) 3 u $r°’Po(cos db) - $ 
25. Set 1/r = p. Then u(p, 0, 6) = rv(7, 0, b), up = (VU + rv,)(—1/p), 
Upp = (Qe + Wer)(1/p*) + (V + rvx)(2/p*), Upp + (2/p)p = rrr + (2/Nvy). 
Substitute this and ugg = rUgg etc. into (7) [written in terms of p] and divide by r°. 


Problem Set 12.12, page 602 


c(s) x x 

oo s*(s + 1) 

7. w = f(x)g(t), xf'g + fe = xt, take f(x) = x to get g = ce™' + t— 1 andc = 1 from 
w(x, 0) = x(c — 1) = 0. 

11. Set x?/(4c?7) = 2” Use z as a new variable of integration. Use erf(~) = 1. 


5.W= , WO, s) = 0, c(s) = 0, wx, ) = x(t — 1 +e) 


Chapter 12 Review Questions and Problems, page 603 


17. u = cy(xe~ 4 + co(x)e*4 — 3 19. Hyperbolic, f(x) + foly + x) 
21. Hyperbolic, fi(y + 2x) + foly — 2x) 23. 3 cos 2t sin. x — z cos 6f sin 3x 
25. sin 0.01 77x e7 0-001143¢ 

27. 2 sin 0.01 rx e~ 9-001143t _ & sin 0.03 rx e~ 0010291 

29. 100 cos 2x e~** 

39. uw = (uy — ug)(Inr)/In (74/79) + (Ug In ry — uy In rp)/In (7/79) 


Problem Set 13.1, page 612 


Liar as, Year si 3.4.8 — 1.4: 
5.x -—iy=—-(x+ iy), x=0 9.-117, 4 
11. —8 — 6i 13. —120 — 40: 
15.3 -i 17. —4x7y? 


19. (x? — y?)/ (x? + y”), Qxy/(x? + y?) 


Problem Set 13.2, page 618 


1. V2 (cos zm + ising7) 
3. 2(cosg77 + ising7), 2(cosg7 — ising7) 


App. 2. Answers to Odd-Numbered Problems A35 


5. 3(cos 7 + isin 77) 7.V1+ 47” (cos arctan 57 + isin arctan 577) 
9. 37/4 11. +arctan ($) = +0.9273 

13. —1024. Answer: 7 15. —3i 

17.2 + 2i 21. W/2 (cos ka + isingym), k= 1,9,17 

23.6, =3 2 3V3i 

25. cos (gm + gkm) + isin(ga + 5km), k=0,1,2,3 

27. cos47 + ising7, cos?m + isin?a, —1 

i: =1=% 31.+(1 - 1), +(2 + 2i) 

33. |z1 + zal” = (z1 + z2)(Z1 + Za) = (21 + Z2\(Z1 + Za). Multiply out and use 


35. 


Re z1Z2 S |z1Z| (Prob. 34). 

z3Z1 + 2Za + zat + zaze = 1241/2 + 2Rezize + Izol? S Iz, /? 

+ 2lzillzel + Iz2l” = (|zi| + [zal)*. Hence |z, + z(? <(\z1| + lzoI)?. Taking 
square roots gives (6). 

[v1 + x2)” + (yn + yay") + [Orr — x2)? + (yn = y2)"] = 207 + yt + xB + YB) 


Problem Set 13.3, page 624 


1. 
3. 
5. 


15. 
17. 


Closed disk, center —1 + 5i, radius 3 

Annulus (circular ring), center 4 — 2i, radii 7 and 377 

Domain between the bisecting straight lines of the first quadrant and the fourth 
quadrant. 


. Half-plane extending from the vertical straight line x = —1 to the right. 
u(x, y) = (1-0/1 - 9 + y*), wl, -1) = 0, 


v(x, y) = (1 — x)? + y9), v0, -1) = -1 
Yes, since Im(|z|?/z) = Im(|z|?Z/(zz)) = Imz = —rsin@ >0. 
Yes, because Re z = rcos 9 —>0 and 1 — |z| ~l asr—0. 


19. f'(z) = 8(z — 4i)’. Now z — 4i = 3, hence f’(3 + 4i) = 8 - 3” = 17,496. 


21. 


nl — zhi, ni 23. 3iz2/(c + 4, —3i/16 


Problem Set 13.4, page 629 


1. 


ry = x/r = cos0, ry = sind, 6, = —(sin)/r, Oy = (cos 6)/r 
(a) 0 = uy — Vy = u,cos 6 + ue(—sin 6)/r — v,sin 8 — ve(cos @)/r 
(b) 0 = uy + vy = usin 6 + ug(cos @)/r + v, cos 8 + ve(—sin 6)/r 
Multiply (a) by cos 6, (b) by sin 6, and add. Etc. 


3. Yes 5. No, f(z) = (z”) 

7. Yes, when z # 0. Use (7). 9. Yes, whenz #0, —277i, 277i 
11. Yes 13. f(z) = $s i(z” +c), c real 
15. f(z) = 1/z + c (c real) 17. f(2) = Y+zte (c real) 
19. No 2a=7, v =e" sin Ty 


23. 
29. 


a=0, v=43b(y2—x%) +c 27.f=u +t i implies if = —v + iu. 
Use (4), (5), and (1). 


Problem Set 13.5, page 632 


3. 
7. 
11. 


e2tig-27 — o-27 — 9.001867 5. e*(—1) = —7.389 
ev 2} = 413i 9. 5e arctan (3/4) _ 5e0-64% 


6.32" 13. V2e7"/4 


A36 


App. 2. Answers to Odd-Numbered Problems 


15. 
17. 
19, 


exp (x? — y’) cos 2xy, exp (x? — y?) sin 2xy 
Re (exp (z3)) = exp (x? — 3xy”) cos (3x2y — y3) 
z= 2nTri, n=0,1,:-:: 


Problem Set 13.6, page 636 
1. Use (11), then (5) for e, and simplify. 7. cosh 1 = 1.543, i sinh 1 = 1.1757 


9. 
15. 
17. 


Both —0.642 — 1.0697. Why? 11. i sinh 7 = 11.55i, both 
Insert the definitions on the left, multiply out, and simplify. 
z= £(2n + 1)i/2 19. 2 = tn7i 


Problem Set 13.7, page 640 


5. 
9. 
13. 


15. 


17. 
19, 
21. 
23. 
25. 
27. 


In ll + wi 7. ¥1n 32 — qi/4 = 1.733 
i arctan (0.8/0.6) = 0.927i M1. Ine + wi/2 = 1 + wi/2 
+2nTri, n=O, 1,-:: 


i : sin | 
In Je*| + i arctan 
cos | 


+ 2nti =0+ i+ 2n7ri, n=0,1,-:: 


In (i?) = In(~1) = (1 + 2n)mi, 2Ini = (1 + 4n)ti,n = 0,1,--- 
e*-* = 6% (cos 3 — isin3) = —54.05 — 7.70i 
0.60.4 — 696 (cos 0.4 + isin 0.4) = 1.678 + 0.710i 


e 

elo Ln d+) _ elnv2+ 7i/4—i Inv2+7/4 _ 2.8079 + 1.3179: 

eS—hln 3+) — 9767 (cos (377 — In3) + isin (3a — In3)) = —284.2 4 
e2-) Ln(-)l) _ eon = em = 23.14 


Chapter 13 Review Questions and Problems, page 641 


1.2 — 3i B27 4G OO. “T6lbe 
11. —5 + 127 13. 0.16 — 0.12i 
15. i 17.4V2e 8°" 
19. 15e77"/2 21.43, +33 
23. (+1 + d/V2 25. f(z) = —iz?/2 
27. f(z) = e"™ 29. f(z) =e? /? 
31. cos 3 cosh 1 + isin3 sinh 1 = —1.528 + 0.166 
33. i tanh 1 = 0.7616i 
35. cosh 7 cos 7 + isinh 7 sin 7 = —11.592 


Problem Set 14.1, page 651 


. Straight segment from (2, 1) to (5, 2.5). 

. Parabola y = x” from (1, 2) to (2, 8). 

. Circle through (0, 0), center (3, —1), radius 10, oriented clockwise. 
. Semicircle, center 2, radius 4. 

. Cubic parabola y = x2 (—2 SxS 2) 

-cajp=tt+2+ni (-lstZl) 

ct) =2-it+2e (OSt=7) 


0.7851 


556.41 


App. 2. Answers to Odd-Numbered Problems A37 
15. z(t) = 2 cosht + i sinh t(—% <t< 0) 
17. Circle 7) = —a —ib+ re" (0StS27) 
19.2) =t+(1—-4ri (-2 S182) 
2. =(1+i)¢ (St83), Rez=t, 2 ()=1 + i. Answer: 4 + 4i 
23. 677 — e™ =1-—(-1)=2 
25. xexp ae s(e} e') = —sinh 1 
27. tan gi — tang = itanhg — 1 
29. Im z? = 2xy = 0 on the axes. 2 =1+(-1 +a) OSSD), 
(Im 2”) 2 = 2(1 — Ay(—1 + 3) integrated: (—1 + i)/3. 
35. |Re z| = |x| $3 = MonC,L = V8 
Problem Set 14.2, page 659 
1. Use (12), Sec. 14.1, with m = 2. 3. Yes 5.5 
7. (a) Yes. (b) No, we would have to move the contour across +2i. 
9. 0, yes 11. 7ri, no 
13. 0, yes 15. —7, no 
17. 0, no 19. 0, yes 
21. 277i 23. 1/z + 1/(z — 1), hence 277i + 277i = 477i. 
25. 0 (Why?) 27. 0 (Why?) 
29. 0 
Problem Set 14.3, page 663 
1. 277iz?/(z — 1)|z--1 = — Ti 3.0 
5, 27ri(cos 3z)/6|z-9 = Ti/3 7. 2mi(i/2)?/2 = 17/8 
1 T 
- Ls = . (z+ 2=2 = j 
11. 277i z+ 2il,_4 2 13. 277i(z + 2)|,-9 = 877 
15. 277i cosh (-a? — Ti) = —27ri cosh 1” = —60,739i since cosh 7i = cos 7 = —1 
and sinh vi = isin 7 = 0. 
Ln (z + 1) Ln (1 + 2) 
17. 277i - = 277i = m(InV2 + i/4) = 1.089 + 2.467: 
v6 et 2-1 2i 
19. 27rie”*/(2i) = me” 
Problem Set 14.4, page 667 
1. (277i/3!)(—cos 0) = —7i/3 3. (2ari/(n — 1)!e° 
5. AT (cosh 22)" = = - 8 sinh 1 = 9.845; 


7. (2ari/(2n)!) (cos J?” |,~9 = (27ri/(2n)!\(— 1)" cos 0 = (—1)"277i/(2n)! 


9, —27ri(tan 7z)' 


11. 


—2T71* 7 ; 
= -2n7i 


z=0 cos” ITZ 


z=0 


= g7ri(sin z + (1 + z) cos 2le-12 


ml + 2)sin 2)' 


z=1/ 


= 5 Tri(sin 5 a 3 cos 5) 
= 2.8211 


A38 App. 2. Answers to Odd-Numbered Problems 


13. 277i - , | = Ti 15. 0. Why? 


z=2 
17. 0 by Cauchy’s integral theorem for a doubly connected domain; see (6) in Sec. 14.2. 
19, (227i/2!)4-(e**)" |,- nia = —9T(1 + d/(64-V2) 


Chapter 14 Review Questions and Problems, page 668 


21. § cosh (—477”) — 5 = 2.469 
23. 2mri(e*) |, 9 = ie*/12|,-9 = Ti/12 by Cauchy’s integral formula. 


25. —2zri(tan 17z)'l,-1 = —277i/cos” 2,1 = —2777i 
27. 0 since z2 + Zz — 2 = 2x? — y”) andy = x 
29. —4A7i 


Problem Set 15.1, page 679 


1. 2, = (2i/2)”; bounded, divergent, +1, +i 
3. Zn = —$77i/(1 + 2/(ni)) by algebra; convergent to —7i/2 
5. Bounded, divergent, +1 + 107 
7. Unbounded, hence divergent 
9. Convergent to 0, hence bounded 
17. Divergent; use 1/Inn > I/n. 19. Convergent; use > 1/n”. 
21. Convergent 23. Convergent 
25. Divergent 
29. By absolute convergence and Cauchy’s convergence principle, for given e > 0 we 
have for every n > Me) and p = 1, 2,--- 


lZn-eal ae ee ae ae <€, 
hence |zn44 + °° + | < e€ by (6*), Sec. 13.2, hence convergence by Cauchy’s 
principle. 
Problem Set 15.2, page 684 


1. No! Nonnegative integer powers of z (or z — Zg) only! 
3. At the center, in a disk, in the whole plane 
5. Yanz2" = Da,(z2)", |z2| < R = lim |ay/an+1\|; hence |z| < VR. 


7. 1/2, © 9. i,V3 11. 0,28 
13. —i, 5 15. 2i, 1 17. 1/V2 


Problem Set 15.3, page 689 


3. f= Wan. Apply l’H6pital’s rule to Inf = dn n)/n. 
5.2 7. W3 9. 1/V2 
1s 13. 1 15.3 


Problem Set 15.4, page 697 


App. 2. Answers to Odd-Numbered Problems 


5.3 a +4 sz’? + ane ert. AS W2 
1 {4 i 4 

24 =] fe R= 
oe Ft Ded” Beg ® ; . 

1. 3,165 

9, - 5? + = + ag = 
| ( 3! )a 2 67 + 49% a 
0 

11, 27/(1!3) — 27/31) + V/GNI- +-+-, R= 

13. (2/Vay(z — 23/3 + 22/215) — 2"/GBI7) + +++), R= 

17. Team Project. (a) (Ln(1 + 2)’ =1-—zt+2- +++: =1/ +2. 


(c) Use that the terms of (sin iy)/(iy) are all positive, so that the sum cannot be zero. 


19.34 51+ 3 -)+(-4+40@-)? -4@—-D2+-:-, R= V2 
2 4 6 
1 1 1 1 1 1 
ae i (e wr) + 2(c vr) i (e br) tom, ae 
23. -4 - 2c —) + He —- I? + Bic — I? - Ac - d* R=2 


3 3 5 5 
28.2(< si) +3 (< i) — (< i) + +++, R 
2 3! 2 5! 2 


Problem Set 15.5, page 704 


3.|z+ i] S$ V3 -6, 6>0 

5.|z+3i| 34-6, 5>0 

7. Nowhere 

9. |z-—2i| =2-8, 6>0 

11. |c”| S 1 and }1/n? converges. Use Theorem 5. 
13. |sin” |z|| S 
15. R = 4 by Theorem 2 in Sec. 15.2; use Theorem 1. 
17. R = 1/V7 > 0.56; use Theorem 1. 


Chapter 15 Review Questions and Problems, page 706 


11. 1 13.3 
15.5 17.%, e* 
19. 2%, cosh Vz 
oo in 
21. 5 ——_,, R=» 
4 (Qn 1! 
oa oe Ge) ae 
23.— + 2z=1+ 22", R= 
Hig gees 2 2 ny 2) “ 
n=1 
ai 2n-2 
25. Der ae Zz », R=2 
n=1 
27. cos [(z — dar) + Br] = -(¢ — Ar) +3 — 4m? - + 
1 1 2 1 3 
: + + —3y- + 
29.In3 + 3@— 3) - 5 9G - 3 + eG - DD 


1 for all z, and }1/ n converges. Use Theorem 5. 


—sin (z — 377) 


A39 


A40 App. 2. Answers to Odd-Numbered Problems 


Problem Set 16.1, page 714 


1:27 ao + wore + tee, 0 < |z| < 2 
3.23 4c +424 §2+He? tee, O< [z|< 
Bot tc tt ltztete, O< [ze] <1 
12 +d24 fe beg bo, O< |Z ee 
9, exp [1 + (2-— DI@- 17 =e [@- D7 +@-)"*+5+8@-VYDt-], 
0<|z-1|< 
a [e+e —TOP . Gi’ Qi 1 
; (z — ai)* (-mi)* (-mi® (¢- i 
\—-3 oo 
13.1791 + 2—") «- i)? = > ( "Vie — a= ie 9 
; n=0 nl 
—3z-ip) 1-6 +10z-D+°, O<[z-il <1 
15. (—cos (z — 77)\(z — )72 = -(z mW) 2+3 a (z Te toe, 
0<|z-a| < 
1 el, =, — Iz] >1 
n=0 n=0 < 
21. —(¢ + 377) ‘cos (z + 9m) = —( + 9m) 1+ 50 + aT) — sae tem) t+ 
lc +37| >0 
23, 28 + 72 4 716+...) [2] <1, ge-1-zgt#—-z 8... |g >1 


ee ee ee 


25.>—— 
Z-i zZ-i 


Section 16.2, page 719 


1.0 + 27, +477,---, fourth order 3. —81i, fourth order 
5. +1, £2,---, second order 7. £(2 + 2i), +i, simple 
9. ¥sin 4z, z = 0, +77/4, +77/2,---, simple 


11. f(@) = (< — 20)"g(2), a(Zo) # 0, hence f7(z) = (z — z0)?"g(z). 

13. Second-order poles at i and —2i 

15. Simple pole at ©, essential singularity at 1 + i 

17. Fourth-order poles at +n7ri,n = 0, 1,---, essential singularity at o 

19. e*(1 — e*) = 0, e* = 1, z = £2nz77i simple zeros. Answer: simple poles at +2n7ri, 
essential singularity at 

21. 1, © essential singularities, +2n7ri, n = 0, 1,---, simple poles 


Section 16.3, page 725 


3. 75 at 0 5. +4iat Fi 
7. 1/q at 0, £1,-°- 9. —1 at +2n77i 
11. (e*)" /2!|,-94 = —g at z= Ti 


15. Simple pole at z inside C, residue —1/(27r). Answer: —i 

17. Simple poles at 77/2, residue e”/?/(—sin 77/2), and at —71/2, residue 
e~7/?/sin 17/2 = e~ 7/2. Answer: —477i sinh 77/2 

19, 277i (sinh 51)/2 = —7 sing 

21. <-° cos mz = +++ + 14/(4lz) — +++. Answer: 277/24 


App. 2. Answers to Odd-Numbered Problems A41 


23. Residues 5 atz = a 2atz= 3. Answer: 57i 
25. Simple poles inside C at 2i, —2i, 3i, —3i, residues (2i cosh 2i)/(4z* + 26z)|2—9; = 
db. o> i> i> respectively. Answer: 277i * to 


Problem Set 16.4, page 733 


1. 27/Vk? - 1 3. 7/V2 
5. 52/12 7. 2atr/V a? — 1 
9. 0. Why? (Make a sketch.) 11. 7/2 
13. 0. Why? 15. 77/3 
17. 0. Why? 
19. Simple poles at +1, i (and —i);2ai + gi + Ti(-4 +4) = —5T77 


21. Simple poles at 1 and +27ri, residues i and —i. Answer: . (cos 1 — e~?) 


23. —77/2 25. 0 
27. Let q(z) = (z — a1)(z — de)*+* (z — ax). Use (4) in Sec. 16.3 to form the sum of 
the residues 1/q'(ay) + --- + 1/q'(a,) and show that this sum is 0; here k > 1. 


Chapter 16 Review Questions and Problems, page 733 


11. 677i 13. 277i(—10 — 10) 

15. 277i(25z)'|,-5 = 5007 17. 0 (n even), (—1)"~ /?277i/(n — 1)! (n odd) 
19. 77/6 21. 77/60 

23. 0. Why? 25. Res e/(z7 + 1) = 1/(Qie). Answer: 1/e. 


Problem Set 17.1, page 741 


5. Only in size 
Tx=cw=-ytic; y=kw=—-k-+ ix 
9. Parallel displacement; each point is moved 2 to the right and 1 up. 
11. |w| $4, —7/4 < Argw < 77/4 13. -5 SRezS -2 
15.u21 17. Annulus 5 S |w| S$ 4 
19.0<u<In4, 7/4<v0 $37/4 
2. +acr>t+bet+c, 2= —-3(a + Va? — 3b) 
23. z = (-1 + V3)/2 
25. sinhz = Oatz=0, +771, +277,::: 
29. M = |z| = 1 on the unit circle, J = Iz|? 
31. |w’| = 1/|z|? = 1 on the unit circle, J = 1/|z|* 
33. M = e” = 1 for x = 0, the y-axis, J = al 
35. M = 1/|z| = 1 on the unit circle, J = 1/|z|? 


Problem Set 17.2, page 745 


+i +i 
Iga : igo : 
2w —3iw + 1 


11.2=0, 1/(a + ib) 13.2=0, +3,+ = +i/2 


A42 App. 2. Answers to Odd-Numbered Problems 


az+b 
19. w= 


15. z = i, 2i 17.w = = ——— 
ae io cz ta —bz+a 


Problem Set 17.3, page 750 
3. Apply the inverse g of f on both sides of z; = f(z1) to get g(z1) = g(f(z1)) = Z1. 


9. w = iz, a rotation. Sketch to see. I. w = (z + iI/(z — i) 
13. w = 1/z, almost by inspection 15. w = 1/z- 1 
17. w = (2z — i/(-iz — 2) 19. w = (z* — i(-iz* + 1) 


Problem Set 17.4, page 754 


1. Circle |w| = e° 3. Annulus 1/Ve = |w| S Ve 
5. w-plane without w = 0 7.1< |wl<ev>0 
9, +(2n + 1)7/2, n=0,1,-°: 
11. u?/cosh? 2 + v?/sinh?2 <1, u>0,v>0 
13. Elliptic annulus bounded by u?/cosh” 1 + v?/sinh? 1 = 1 and 
u2/cosh? 3 + v?/sinh? 3 = 1 
15. cosh z = cos iz = sin (iz + 577) 
17. 0 < Imt < 7 is the image of R under ¢ = z”/2. Answer: e* = or 
19. Hyperbolas u?/cos” ¢ — v/sin? ¢ = cosh” c — sinh? c = 1 when c # 0, 77, and 
u = cosh y (thus \u| = 1),v = 0 when c = 0, 77. 
21. Interior of u2/ cosh? 2 + v?/ sinh? 2 = 1 in the fourth quadrant, or map 
T/2<x<7,0<y <2 by w = sinz (why?). 
23.u <0 
25. The images of the five points in the figure can be obtained directly from the 
function w. 


Problem Set 17.5, page 756 


1. w moves once around the circle |w] = 5. 
3. Four sheets, branch point at z = —1 

5. —i/4, three sheets 

7. Zo, n Sheets 


9. Vz(z — i\(z + i), 0, ti, two sheets 


Chapter 17 Review Questions and Problems, page 756 


11.1 < |w| <4, larg w| < 7/4 13. Horizontal strip —8 <v < 8 
5.u=1- qv", same (why?) 17. |w| > 1 
19.4 < |w| <§, v<0 21.w=1+iuw, v<0O 
10z + 5i : . 
23. w = ———— 25. Rotation w = iz 
z+ 2i 
27. w = 1/z 29.z =0 
Sig = 24/6 33.2 = 0, it, £31 
35. w = e* 37.w = ize +1 
39. w = z7/(2c) 


App. 2. Answers to Odd-Numbered Problems A43 


Problem Set 18.1, page 762 
1.2.5mm = 0.25cm; ® = Re 110(1 + (Lnz)/In 4) 


20 
3. 0 = Re(30 - 20 ins) 


5. ® (x) = Re (375 + 25z) 
7. P(r) = Re (2 — z) 
13. Use Fig. 391 in Sec. 17.4 with the z- and w-planes interchanged and 
cos z = sin(z + 371). 
15. ® = 220(x? — 3xy”) = Re (220z?) 


Problem Set 18.2, page 766 


3.w = iz maps R onto the strip —2 S u S 0; and ®* = Uy + (Uy — Us) + 5U) = 
Uz + (U, — U2) — xy). 
(x — 2)(2x — 1) + 2y? 
@-2 +9 
7. See Fig. 392 in Sec. 17.4. ® = Re(sin”z),_ sin?x(y =0), sin? x cosh? 1 — cos? x 

sinh? 1(y = 1), —sinh? y (x = 0, 77). 
9. B(x, y) = cos” x cosh” yr sin” x sinh” y; cosh” y (x = 0), —sinh y « = 9), 
cos” x (y = 0), cos” x cosh? 1 — sin? x sinh? 1 Gy =1) 
13. Corresponding rays in the w-plane make equal angles, and the mapping is conformal. 
15. Apply w = ae 
17. z = (2Z — i)/(—iZ — 2) by (3) in Sec. 17.3. 


2) Si 
19. ® = alee = 2), FH = late = 2) 


5. (a) =, (b) x2 — y2 =c, wyw=c, e“cosy=c 


Problem Set 18.3, page 769 
1. (80/d)y + 20. Rotate through 77/2. 


80 y 80 
5. a arctan ~ = Re(- a Ln:) 


2 y 2i 
7.7; + Us a T)) arctan > = Re( 7 = ale = ni) Ln) 


q, y , i za 
9. 7 (arctan y= ph ~ aretan -— 3) Re(2 Ln = *) 

100 100i, zt+l1 
11, 100 (arg (@ = 1) ~ Arg @ + D) = Re( = ini“) 


13, 100 [Arg (22 — 1) — Arg (z” + 1)] from w = 2” and Prob. 11. 


15. —20 + (320/m) Arg z = Re( 20 3201 Ln :) 
17. Re F(z) = 100 + (200/77) Re (arcsin z) 


Problem Set 18.4, page 776 


1. V(z) continuously differentiable. 
3. |F(iy)| = 1 + 1/y?, |y| = 1, is maximum at y = +1, namely, 2. 


A44 


App. 2. Answers to Odd-Numbered Problems 


19. 


. Calculate or note that V? = div grad and curl grad is the zero vector; see Sec. 9.8 and 


Problem Set 9.7. 


. Horizontal parallel flow to the right. 
_ 4 
»F@)=2z _ 
. Uniform parallel flow upward, V = F’ = ik, 4 = 0, = K 
FQ) = 2 
» F(2) = 2/to + ro/z 
. Use that w = arccos z gives z = cos w and interchanging the roles of the z- and 


w-planes. 
y/(x? + y*?) = corx? t+ (y-khP =k? 


Problem Set 18.5, page 781 


5. 
7. 
9. 


11. 


13. 


15. 


17. 


® = 3 r? sin 30 

® = 3a + gar* cos 80 

® = 3 — 4r? cos 20 + r*cos 40 
2 1 1 

m= 2 (sin —=r*sin 20 + =r? sin 30 — + -) 
7 2 3 


@ =~ rsing + <r? sin 26 — —-r? sin 30 — or*sin 49 + + —--- 
2 Oar 4 


7 
=F +2 (reos9 — tr%c0s 30 + r%c0s50 = +) 
2 T 3 5 
1 4 l 5 1 3 
pt eae a Eee eae eee 
co) so (rcos¢ ae cos 26 9” cos 30 ) 


Problem Set 18.6, page 784 


1. Use (2). F(zo +e”) = & + e%)%, etc. F(R) = 33° 
3. Use (2). F(zg + e%) = (2 + 3c"), etc. F(4) = 100 
5. No, because |z| is not analytic. 
1-27 
7. ©0,-9=-3=4/ | (d+ rcosa)(—3 + rsina)rdrda 
0-0 
1.27 
=i | (3+ )drda=1(—3). 2m 
T 7 2 
0-0 
1 1-27 
§..0(1, 1) = 3 = =| | 3+rcosa+rsina + r2cos asin a)rdrda 
0-0 
1 3 
Sea oT 
13. |F(2)| = [cos? x + sinh? y]/2,. z= +i, Max = [1 + sinh? 1]? = 1.543 


17. 
19, 


. |F(|? = sinh? 2x cos” 2y + cosh? 2x sin? 2y = sinh? 2x + 1+ sin? 2y, z= 1, 


Max = sinh 2 = 3.627 
|F(2)|? = 4(2 — 2.cos 26), z= 77/2, 37/2, Max =4 
No. Make up a counterexample. 


App. 2. Answers to Odd-Numbered Problems A45 
Chapter 18 Review Questions and Problems, page 785 
1. ® = 101 =x+y), F=10— 100 + dz 


13. ® = Re (220 — 95.54 Ln z) = 220 sO Inr = 220 — 95.54 Inr. 
n 


17. 2(1 — (2/77) Arg z) 

19. 30(1 — (2/7) Arg (z — 1)) 

21.0 =x+y-=const, V= F(z) =1-i, parallel flow 
23. T(x, y) = x(2y + 1) = const 

2.F (2 =z+l=x+1-jy 


Problem Set 19.1, page 796 


1. 0.84175 - 107, —0.52868 - 108, 0.92414. 107-3, —0.36201 - 10° 
3. 6.3698, 6.794, 8.15, impossible 
5. Add first, then round. 
7. 29.9667, 0.0335; 29.9667, 0.0333704 (6S-exact) 
9. 29.97, 0.035; 29.97, 0.03337; 30, 0.0; 30, 0.033 
11. Jel = |x ty —@& +H) = la —-*) + (y- PI = lex + ey| 
= |e,| + ley] = B. + By 


ay a, + € ad, + €& €9 re ay Ey €9 ay 
13. =x =— La + tere Je + —- : 
ag, dg + €9 d2 a2 a3 d2 d2 dag a2 
a ay ay €, €2 
hence = = Sle + |é,o| S + 
dp ae da ay de | rl | | 72, | Bri Bro 


15. (a) 1.38629 — 1.38604 = 0.00025, (b) In 1.00025 = 0.000249969 is 6S-exact. 

19. In the present case, (b) is slightly more accurate than (a) (which may produce 
nonsensical results; cf. Prob. 20). 

21.c4- 24 +++) top 2° = (1011 1.)o, NOT(11101.)o 

23. The algorithm in Prob. 22 repeats 0011 infinitely often. 

25. n = 26. The beginning is 0.09375 (n = 1). 

27. I14 = 0.1812 (0.1705 4S-exact), 113 = 0.1812 (0.1820), Tyg = 0.1951 (0.1951), 
Ty, = 0.2102 (0.2103), etc. 

29. —0.126 - 1077, —0.402 - 107%; —0.266 - 107®, —0.847 - 1077 


Problem Set 19.2, page 807 


3.g =0.5cosx, x = 0.450184 (= x49, exact to 6S) 
5. Convergence to 4.7 for all these starting values. 
7.x = x/(e” sin x); 0.5, 0.63256, --- converges to 0.58853 (5S-exact) in 14 steps. 
9.x =x*— 0.12; x9 = 0,x3 = —0.119794 (6S-exact) 
11. g = 4/x + x3/16 — x°/576; x9 = 2,xXn = 2.39165 (n = 6), 2.405 48-exact 
13. This follows from the intermediate value theorem of calculus. 
15. x3 = 0.450184 
17. Convergence to x = 4.7, 4.7, 0.8, —0.5, respectively. Reason seen easily from the 
graph of f 


A46 


App. 2. Answers to Odd-Numbered Problems 


19. 0.5, 0.375, 0.377968, 0.377964; (b) 1/V7 
21. 1.834243 (= x4), 0.656620 (= x4), —2.49086 (= x4) 
23. x9 = 4.5, x4 = 4.73004 (6S-exact) 
25. (a) ALGORITHM BISECT (f, ao, bo, €, N) Bisection Method 
This algorithm computes the solution c of f(x) = 0 (f continuous) within the 
tolerance €, given an initial interval [ag, bo] such that f(ao) f(bo) < 0. 
INPUT: Continuous function f, initial interval [do, bo], tolerance €, maximum 
number of iterations N. 
OUTPUT: A solution c (within the tolerance €), or a message of failure. 
For n = 0, 1,:::,N — 1 do: 
C= 3 (an + bn) 
If f(c) = 0 then OUTPUT c_ Stop. [Procedure completed] 
Else if f(a,)f(b,) < 0 then set a, +1 = dy, and b,+41 =. 
Else set dy, 41 = c, and by +4 = dy. 
If lan +41 = by+1| <elc| then OUTPUT c. Stop. [Procedure completed] 
End 
OUTPUT [ay, by] and a message “Failure”. Stop. 
[Unsuccessful completion; N iterations did not give an interval of length not 
exceeding the tolerance. | 
End BISECT 


Note that [ay, by] gives (ay + by)/2 as an approximation of the zero and (by — ay)/2 
as a corresponding error bound. 
(b) 0.739085; (c) 1.30980, 0.429494 
27.x2g = 1.5, xg = 1.76471,---, x7 = 1.83424 (6S-exact) 
29. 0.904557 (6S-exact) 


Problem Set 19.3, page 819 


1. Lo(x) = —2x + 19, Lyx) = 2x — 18, px(9.3) = Lo(9.3) - fo + L1(9.3) “fy 
= 0.1086 - 9.3 + 1.230 = 2.2297 


(x — 1.02)(x — 1.04) (x = 1) — 1.04) 
a ama ec 07 
(x — 1) — 1.02) 


- 0.9784 = x? — 2. + 2.580: 0.9943, 0. 
nos 00 0.9784 = x 580x + 2.580; 0.9943, 0.9835 


5. 0.8033 (error —0.0245), 0.4872 (error —0.0148); quadratic: 0.7839 (—0.0051), 
0.4678 (0.0046) 

7. po(x) = 1.1640x — 0.3357x?; —0.5089 (error 0.1262), 0.4053 (—0.0226), 
0.9053 (0.0186), 0.9911 (—0.0672) 

9. po(x) = —0.44304x? + 1.30896x — 0.023220, po(0.75) = 0.70929 
(5S-exact 0.71116) 

11. Lo a(x — I(x — 2)(x — 3), Ly = 5x(x — 2)(x — 3), Lo xx(x — 1)(x — 3), 
Lz = gx(x — 1)(x — 2); p(x) = 1 + 0.039740x — 0.335187x2 + 0.060645x°; 
p2(0.5) = 0.943654, p3(1.5) = 0.510116, p3(2.5) = —0.047991 

13. 2x7 — 4x + 2 

15. pa(x) = 2.1972 + (x — 9) - 0.1082 + (x — 9)(x — 9.5) - 0.005235 

17. r = —1.5, po(0.3) = 0.6039 + (—1.5) - 0.1755 + 3(—1.5)(—0.5) « (—0.0302) 
= 0.3293 


App. 2. Answers to Odd-Numbered Problems 


Problem Set 19.4, page 826 


9. [-1.39(x — 5)* + 0.58(x — 5)3]” = 0.004 at x = 5.8 (due to roundoff; 
should be 0). 
Lip l-axe ae 


13.1 —x7, -24¢-1-(—- 1% + 20-13, -14+ 24-2) + 5x — 2)? 


— 6(x — 2)% 
15.4+x72-—x3, -—8 — 2) — 5(x — 2)? + 5(x — 2)%, 
A + 32(x — 4) + 25(x — 4)? - 11 - 4)? 


A47 


17. Use the fact that the third derivative of a cubic polynomial is constant, so that g”” 


is piecewise constant, hence constant throughout under the present assumption. 


Now integrate three times. 
19. Curvature f’/(1 + f’”)?/? = f” if |f’| is small. 


Problem Set 19.5, page 839 


1. 0.747131, which is larger than 0.746824. Why? 
3.0.5, 0.375, 0.34375, 0.335 (exact) 
5. €9.5 ~ 0.03452 (€9.5 = 0.03307), €0.25 ~ 0.00829 (€9.25 = 0.00820) 
7. 0.693254 (6S-exact 0.693147) 
9. 0.073930 (6S-exact 0.073928) 
11. 0.785392 (6S-exact 0.785398) 
13. (0.785398126 — 0.785392156)/15 = 0.39792 - 10-® 


15. (a) Mz = 2, |KMo| = 2/(12n”) = 1079/2, n = 183. (b) f” = 24/x°, Ma = 24, 


|CM4| = 24/(180 - (2m)*) = 107°/2, 2m = 12.8, hence 14. 
17. 0.94614588, 0.94608693 (8S-exact 0.94608307) 
19. 0.9460831 (7S-exact) 
21. 0.9774586 (7S-exact 0.9774377) 
23. Setx =4$(t + 1), 0.2642411177 (10S-exact), 1 — 2/e 
25.x=3(t+ 1), dx = 5dt, 0.746824127 (9S-exact 0.746824133) 
27. 0.08, 0.32, 0.176, 0.256 (exact) 
29. 5(0.1040 — 5 - 0.1760 + 4+ 0.1344 — 4 + 0.0384) = 0.256 


Chapter 19 Review Questions and Problems, page 841 


17. 4.375, 4.50, 6.0, impossible 
19. 44.885 = s S 44.995 
21. The same as that of a. 


23. x = 20 + V398 = 20.00 + 19.95, x; = 39.95, x2 = 0.05, x2 = 2/39.95 


= 0.05006 (error less than | unit of the last digit) 
25.x =x*- 0.1, 0.1, 0.999, —0.99900399 
27. 0.824 
29. -x+x7, 2-1 4+3a@-1% -(- 1 
31. 0.26, Mz=6, M3=0, -002Se€0, 001 
33. 0.90443, 0.90452 (5S-exact 0.90452) 


35. (a) (0.43 — 2 - 0.23 + 0)/0.04 = 1.2, (b) (0.3? — 2 - 0.23 + 0.1)/0.01 = 1.2 (exact) 


A48 App. 2. Answers to Odd-Numbered Problems 


Problem Set 20.1, page 851 
1.x, = 7.3, xg = -3.2 3. No solution 5.x, = 2, 
=3 6 =9 —46.725 
7.| 0 9 .—13 =31.223 


0 0 —2.88889 —7.38689 
x1 = 3.908, x = —1.998, x3 = 2.557 


13, —-8 0 178.54 
9.| 0 6 13 137.86 


0 0 =16 253,12 
x1 = 6.78, x9 = —11.3, x3 = 15.82 


34 6.12 =2.72 0 
11.| 0 0 4.32 0 


0 0 0 0 
xy = tyarbitrary, xo = (3.4/6.12)t4, x3 =0 


a) 0 6 —0.329193 
13.)}0 -4 -3.6 —2.143144 


0 0 2.3. —0.4 
X 1 = 0.142856, x2 = 0.692307, x3 = —0.173912 


=] +3.) 2.5 0 —8.7 
0 22 1.5 =33 = 93 
15. 
0 0 —1.493182 —0.825 1.03773 
0 0 0 6.13826 12.2765 


x1 = 4.2, xo = 0, xg = —-18, x4 = 2.0 


Problem Set 20.2, page 857 


1 O|]/4 5| x.=-4 
1. ? 
3 1 0) —I x2 > 


— 
So 
So 
Ww 
\o 
lon 


tad 
N 
IN lp 


io) 

wo = 
—_ 
o Oo 
oO MN 
= oS 
oOo = 

(tt =m 

SO 
NF 
Io | 
Go Oo 
co fs 


= 
wo 
II 


Xg=1 
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0 

-1 2 0 ol/0 2 -1 O| x,=-3 
O10 0 3 -1] xg= 4 
4 


0 0 0 4 xg= 1 


13. No, since x! (—A)x = —x'Ax < 0; yes; yes; no 


=3.5 1.25 
15. 
3.0 —1.0 
584 104 —66 
1 
17. — | 104 20 —12 
36 


19, — 


Problem Set 20.3, page 863 


5. Exact 0.5, 0.5, 0.5 7.%1=2, x2=—-4, x3 =8 

9, Exact 2, 1, 4 

11. (a) x" = [0.49983 0.50001 0.500017], 
(b) x" = [0.50333 0.49985 0.49968] 

13.8, —16, 43, 86 steps; spectral radius 0.09, 0.35, 0.72, 0.85, approximately 

15. [1.99934 1.00043 3.99684]" (Jacobi, Step 5); [2.00004 0.998059 4.00072]" 
(Gauss-Seidel) 

19. V306 = 17.49, 12, 12 


Problem Set 20.4, page 871 


1.18, V110 = 10.49, 8, [0.125 -0.375 1 0 —0.75 O] 
3.5.9, V13.81 = 3.716, 3, 4[0.2 0.6 —2.1 3.0] 
5.5, V5, 1, [1 1 1 1 1) 7Zab+be+ca=0 


A50 App. 2. Answers to Odd-Numbered Problems 


9K =5°5 =25 1. k = (5 + V5)\(1 + 1/V5) = 6 + 2V5 
13. kx = 19- 13 = 247; ill-conditioned 
15. k = 20- 20 = 400; _ ill-conditioned 
17. 167 S 21-15 = 315 
19. [—2 ay", [— 144.0 184.0)", k = 25,921, extremely ill-conditioned 
21. Small residual [0.145 0.120], but large deviation of x. 
23.27, 748, 28,375, 943,656, 29,070,279 


Problem Set 20.5, page 875 


1. 1.846 — 1.038x 3. 1.48 + 0.09x 
5. 5 = 90t — 675, Vay =90km/hr 9. — 11.36 + 5.45x — 0.589x? 
11. 1.89 — 0.739x + 0.207x2 
13. 2.552 + 16.23x, —4.114 + 13.73x + 2.500x?, 2.730 + 1.466x 
— 1.778x? + 2.852x3 


Problem Set 20.7, page 884 


1.5,0,7; radii 6, 4, 6. Spectrum {—1, 4, 9} 
3. Centers 0; radii 0.5, 0.7, 0.4. Skew-symmetric, hence A = iw, —0.7 S pw S 0.7. 
5.2,3,8; radii 1 + V2,1, V2; actually (4S) 1.163, 3.511, 8.326 
7.11 = 100, tog = tg3 = 1 
9. They lie in the intervals with endpoints aj; = (n — 1)- 10°, Why? 
11. p(A) = Row sum norm||A||.. = max § |aj,| = max(|a;| + Gerschgorin radius) 
j 


k 
13. V/122 = 11.05 
15. V0.52 = 0.7211 
17. Show that AA' = A'A, 
19. 0 lies in no Gerschgorin disk, by (3) with >; hence det A = Ay--- Ay # 0. 


Problem Set 20.8, page 887 


1. g = 10, 10.9908, 10.9999; |e] = 3, 0.3028, 0.0275 

3.g+6=4+ 1.633, 4.786 + 0.619, 4.917 + 0.398 

5. Same answer as in Prob. 3, possibly except for small roundoff errors. 

7. gq = 5.5, 5.5738, 5.6018; |e] S 0.5, 0.3115, 0.1899; eigenvalues (48) 1.697, 
3.382, 5.303,5.618 

9.y = Ax = Ax, y'x =Ax'x, yy = A’x'x, 
e? S y'y/x'x — (y'x/x'x)? = A? — a2 =0 

11. g = 1,---, —2.8993 approximates —3 (0 of the given matrix), 
le] S 1.633,---, 0.7024 (Step 8) 


Problem Set 20.9, page 896 
0.98 —0.4418 0 
1.| —0.4418 0.8702 0.3718 
0 0.3718 0.4898 


App. 2. Answers to Odd-Numbered Problems AS51 
7 —3.6056 0 
3.| —3.6056 13.462 3.6923 
0) 3.6923 3.5385 
3 —67.59 0 0 
—67.59 143.5 45.35 0 
5. 
0 45.35 23.34 3.126 
0 0 3.126 —33.87 
7. Eigenvalues 16, 6, 2 
[ 11.2903 5.0173. 0 | 14.9028 —3.1265 0 | 15.8299 —1.2932 0 
—5.0173 10.6144 0.7499], | —3.1265 7.0883 0.1966], | —1.2932 6.1692 0.0625 
| 0 0.7499 2.0952 0 0.1966 2.0089 0 0.0625 2.0010 | 
9. Eigenvalues (4S) 141.4, 68.64, —30.04 
[141.1 4.926 0 [141.3 2.400 0 [141.4 1.166 0 | 
4.926 68.97 0.8691 |, 2.400 68.72 0.3797 |, 1.166 68.66 0.1661 
| 0 0.8691 —30.03 0 0.3797 —30.04 0 0.1661 —30.04 | 


Chapter 20 Review Questions and Problems, page 896 


15. [3.9 43 1.8)" 
17.[-2 0 5]' 
0.28193 —0.15904 —0.00482 

19.| —0.15904 0.12048 + —0.00241 
—0.00482 —0.00241 0.01205 
5.750 6.400 6.390 

21.| 3.600 |, | 3.559}, | 3.600 
0.838 1.000 0.997 
Exact: [6.4 3.6 1.0]! 
1.700 1.986 2.000 

23.| 1.180], | 0.999}, | 1.000 
4.043 4.002 4.000 
Exact: [2 1 4)" 

25.42, W674 = 25.96, 21 27. 30 


29.5 

33.5% =3 

37. Centers 15, 35, 90; 
39. Centers 0, —1, —4; 


35. 1.514 + 1.129x — 


radii 9, 6, 7, respectively; 


31. 115 - 0.4458 = 51.27 


0.214x? 


radii 30, 35, 25, respectively. Eigenvalues (3S) 2.63, 40.8, 96.6 
eigenvalues 0, 4.446, —9.446 


A52 


App. 2. Answers to Odd-Numbered Problems 


Problem Set 21.1, page 910 


17. 


19. 


.y = 5e°?" 0.00458, 0.00830 (errors of y5, ¥10) 

-y =x — tanhx (sety —x =u), 0.00929, 0.01885 (errors of y5, v1) 
-y =e", 0.0013, 0.0042 (errors of ys, 19) 

-y=1/d- x” /2), 0.00029, 0.01187 (errors of y5, y10) 

. Errors 0.03547 and 0.28715 of ys and yy9 much larger 


.y = 1/(1 — x?/2); error —1078, —4- 1078, ---,-6-107%, +9- 107%; 
e = 0.0002/15 = 1.3 - io (use RK with h = 0.2) 
.y =tanx; error 0.83 - 1077, 0.16 - 107%, ---, —0.56 - 107°, +0.13 - 107° 
-y =3cosx —2 cos” Xx; error - 10’: 0.18, 0.74, 1.73, 3.28, 5.59, 9.04, 14.3, 22.8, 
36.8, 61.4 
y’ = 1/22 — x); error - 10%: 0.2, 3.1, 10.7, 23.2, 28.5, —32.3, —376, —1656, 
—3489, +80444 
Errors for Euler-Cauchy 0.02002, 0.06286, 0.05074; for improved Euler—Cauchy 


—0.000455, 0.012086, 0.009601; for Runge-Kutta. 0.0000011, 0.000016, 0.000536 


Problem Set 21.2, page 915 


1. 


3. 


13. 
15. 


y=e", yt = 1.648717, ys = 1.648722, e5 = —3.8- 1078, 

Vin = 2718276, yo — 271824, |e = —1:8 «10° 

y =tanx, ya,-+:, y4o (error - 10°) 0.422798 (—0.49), 0.546315 (—1.2), 
0.684161 (—2.4), 0.842332 (—4.4), 1.029714 (—7.5), 1.260288 (— 13), 
1.557626 (—22) 


. RK error smaller in absolute value, error - ior = 0.4, 0.3, 0.2, 5.6 


(for x = 0.4, 0.6, 0.8, 1.0) 


Ly = 1/(4 + e738"), y4,-++, yx9 (error - 10°) 0.232490 (0.34), 0.236787 (0.44), 


0.240075 (0.42), 0.242570 (0.35), 0.244453 (0.25), 0.245867 (0.16), 0.246926 (0.09) 


-y = exp (x3) — 1, ya,-**, y4o (error - 10”) 0.008032 (—4), 0.015749 (—10), 


0.027370 (—17), 0.043810 (—26), 0.066096 (—39), 0.095411 (—54), 
0.133156 (—74) 

y = exp (x”). Errors - 10° from x = 0.3 to 0.7: —5, —11, -19, —31, —41 
(a) 0, 0.02, 0.0884, 0.215848, y4 = 0.417818, ys = 0.708887 (poor) 

(b) By 30-50% 


Problem Set 21.3, page 922 


1. 


3. 


5. 


7. 


9. 
11. 


yy = —e 2" + 4e™, yo = —e 2" + e*; errors of yy (of ye) from 0.002 to 0.5 
(from —0.01 to 0.1), monotone 
yi =o, ve ay y=y, = 1, 0.99, 0.97, 0.94, 0.9005, error 

0.005, —0.01, —0.015, —0.02, —0.0229; exact y = cos 3x 
Y1I=ya yo=yitx, y1(0)=1, y2(0)=-2, y=yp=e*-x, y=08 
(error 0.005), 0.61 (0.01), 0.429 (0.012), 0.2561 (0.0142), 0.0905 (0.0160) 
By about a factor 10°. €,,(y1) - 10® = —0.082,---, —0.27, 
é,(yo) * 10° = 0.08,--+, 0.27 
Errors of y; (of yg) from 0.3 - 107° to 1.3 - 107° (from 0.3 - 107° to 0.6 - 107°) 
(y1, y2) = (0, 1), (0.20, 0.98), (0.39, 0.92), -:-, (—0.23, —0.97), (—0.42, —0.91), 
(—0.59), (—0.81); continuation will give an “ellipse.” 
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Problem Set 21.4, page 930 


3. —3u4, + Uy2 = —200, uy, — 3uy2 = —100 

5. 105, 155, 105, 115; Step 5: 104.94, 154.97, 104.97, 114.98 

7.0, 0, 0, 0. All equipotential lines meet at the corners (why?). 
Step 5: 0.29298, 0.14649, 0.14649, 0.073245 

9. 0.108253, 0.108253, 0.324760, 0.324760; Step 10: 0.108538, 0.108396, 
0.324902, 0.324831 

11. (a) wy, = —uy2 = —66. (b) Reduce to 4 equations by symmetry. 

uy. uz u15 U3Z5 92.92, ug, — —ua5 = — 87.45, 
uj42 uz32 uy4 Uu34 64.22, ugg — —uo4 = —53.98, 


W313 = U23 = U33 = 0 

13. uyg = Ugo = 31.25, Ug, — Ugg = 18.75, Ujk = 25 at the others 

15. Ug, — uo3g = 0.25, uyg — Ug. = —0.25, Ujk = 0 otherwise 

17. V3, U4, = Ue, = 0.0849, uy9 = ue = 0.3170. (0.1083, 0.3248 are 4S-values 
of the solution of the linear system of the problem.) 


Problem Set 21.5, page 935 


5; uy, = 0.766, ug, — 1.109, uy2 = 1.957, uo2 — 3.293 
7. A, as in Example 1, right sides —220, —220, —220, —220. 
Solution uy, = Ua, = 125.7, ug, — ug = 157.1 


13. —4u41 + ug, + uy2 = =3, uy4y — 4us1 + ugg = —12, uyy — 4uyj9 ++ uo = 0, 
2u91 + 2u42 = 12ug9 = —14, Uy. u92 2, uo, 4, uy2 i 
Here —4¢ = —$(1 + 2.5) with $ from the stencil. 


15. b = [—200, —100, —100, 0"; u11 = 73.68, w21 = Wy2 = 47.37, U2g = 15.79 (48) 


Problem Set 21.6, page 941 


5. 0, 0.6625, 1.25, 1.7125, 2, 2.1, 2, 1.7125, 1.25, 0.6625, 0 
7. Substantially less accurate, 0.15, 0.25 (t = 0.04), 0.100, 0.163 (¢ = 0.08) 
9. Step 5 gives 0, 0.06279, 0.09336, 0.08364, 0.04707, 0. 
11. Step 2: 0 (exact 0), 0.0453 (0.0422), 0.0672 (0.0658), 0.0671 (0.0628), 0.0394 
(0.0373), 0 (0) 
13. 0.3301, 0.5706, 0.4522, 0.2380 (t = 0.04), 0.06538, 0.10603, 0.10565, 0.6543 
(t = 0.20) 
15. 0.1018, 0.1673, 0.1673, 0.1018 (t = 0.04), 0.0219, 0.0355, ---(t = 0.20) 


Problem Set 21.7, page 944 


1. u(x, 1) = 0, —0.05, —0.10, —0.15, —0.20, 0 
3. For x = 0.2, 0.4 we obtain 0.24, 0.40 (¢ = 0.2), 0.08, 0.16 (tf = 0.4), 
0.08, —0.16 (t = 0.6), etc. 
5. 0, 0.354, 0.766, 1.271, 1.679, 1.834,--- (tf = 0.1); 0, 0.575, 0.935, 1.135, 1.296, 
1.357,:-+ (t = 0.2) 
7. 0.190, 0.308, 0.308, 0.190, (3S-exact: 0.178, 0.288, 0.288, 0.178) 
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Chapter 21 Review Questions and Problems, page 945 


17. y = e”, 0.038, 0.125 (errors of ys and yj) 

19. y = tan x; 0 (0), 0.10050 (—0.00017), 0.20304 (—0.00033), 0.30981 (—0.00048), 
0.42341 (—0.00062), 0.54702 (—0.00072), 0.68490 (—0.00076), 
0.84295 (—0.00066), 1.0299 (—0.0002), 1.2593 (0.0009), 1.5538 (0.0036) 

21. 0.1003346 (0.8 - 10~) 0.2027099 (1.6 « 107“), 0.3093360 (2.1 - 107%), 
0.4227930 (2.3 - 1077), 0.5463023 (1.8 - 107”) 

23. y = sinx, yog = 0.717366, y1.9 = 0.841496 (errors —1.0 - to”. 
—2.5 + 107°) 

25. y1 =yo, ys =x"y1, y =y1 = 1,1, 1, 1.0001, 1.0006, 1.002 

27. y1 = yo, yo = 2e*—y1, y=e”—cosx, y = y, = 0,0.241, 0.571,---; 
errors between 10~® and 107° 

29. 3.93, 15.71, 58.93 

31. 0, 0.04, 0.08, 0.12, 0.15, 0.16, 0.15, 0.12, 0.08, 0.04, 0 (t = 0.3. 3 time steps) 

33. u(Pi1) = u(P31) = 270, u(Po1) = u(Pi3) = u(Pe3) = u(P33) = 30, 
u(Pi2) = u(P32) = 90, u(Pe2) = 60 


35. 0.043330, 0.077321, 0.089952, 0.058488 (¢ = 0.04), 0.010956, 0.017720, 0.017747, 


0.010964 (¢ = 0.20) 


Problem Set 22.1, page 953 


3. f(x) = 2(x1 - 1)? + (xg + 2)? — 6; Step 3: (1.037, —1.926), value —5.992 
9. Step 5: (0.11247, —0.00012), value 0.000016 


Problem Set 22.2, page 957 


7. No 
9. x3, x4 1s the unused time on Mj , Mo, respectively. 
11. f(2.5, 2.5) = 100 
13. f(—3, 3) = 1983 
15. f(9, 6) = 360 
17. 0.5x1 + 0.75x2 S 45 (copper), 0.5x1 + 0.25x S 30, f = 120x1 + 100xo, 
tmax = f(45, 30) = 8400 
19. f = x + x9, 2x1 + 3xq S 1200, 4x4, + 2x9 S 1600, frnax = f(300, 200) = 500 
21. x1/3 + x2/2 S 100, 11/3 + x2/6 S 80, f = 150% + 100x2, fmax = f(210, 60) = 
37,500 


Problem Set 22.3, page 961 
3. f(120/11, 60/11) = 480/11 
5. Eliminate in Column 3, so that 20 goes. fmin = f(0, ) = —10. 
7. fax = F(t» 0, "{05'» 0) = 7 
9. fmax = 6 on the segment from (3, 0, 0) to (0, 0, 2) 
11. We minimize! The augmented matrix is 


1 1.8 2.1 0 0 0 
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The pivot is 600. The calculation gives 


i 0 fh O -wdo 40 Row 1 — gg Row 3 
T, =| 0 0 2 1 ~-a ge Row 2 — agp Row 3 
0 600 500 O 1 3900 Row 3 


The next pivot is 2 The calculation gives 


1 0 0 aa a a Row | — $3 Row 2 
T= | 6 0 F 1 —a0 1 Row 2 
0 600 o 290 2 2400 Row 3 — 732? Row 2 


Hence —f has the maximum value —13.5, so that f has the minimum value 13.5, at 
the point 


2400 105/2 
(X41, X2) a 600 > 35/2 _ (4, 3). 


13. finax = f(5, 4, 6) = 478 


Problem Set 22.4, page 968 


1. f(6, 3) = 84 
3. f(20, 20) = 40 
5. f(10, 5) = 5500 
7. fl, 1,0) = 13 
9. f(4,0,5) =9 


Chapter 22 Review Questions and Problems, page 968 


9. Step 5: [0.353 —0.028]". Slower. Why? 
11. Of course! Step 5: [-1.003  1.897]" 
17. f(2, 4) = 100 

19. f(3, 6) = —54 


Problem Set 23.1, page 974 


0 1 0 
0 0 0 0 
9.| 0 0 1 11. 
1 0 0 0 
1 0 0 
0 0 0 0 
0 1 1 15. ©—_® 
13.| 0 0 1 
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17. If G is complete. 
Edge 


ey e2 &3 4 


1 =1 =! | =1 


19, 


Vertex 


Problem Set 23.2, page 979 


1.5 3.4 
5. The idea is to go backward. There is a v;z_1 adjacent to v;, and labeled k — 1, etc. 
Now the only vertex labeled 0 is s. Hence A(Ug) = 0 implies Ug = s, so that 


Vg — Vy — *** — Ugp—1 — Ux iS a path s vy, that has length k. 
15. Delete the edge (2, 4). 
17. No 


Problem Set 23.3, page 983 


1. (1, 2), (2, 4), (4, 3); Le = 12, Lg = 36, La = 28 

5. (1, 2), (2, 4), 3, 4), 3,5); Le = 2,L3 = 4, L4 = 3,L5 = 6 

7. (1, 2), (2,4), G,4); Le = 10, Lg = 15, L4 = 13 

9. (1, 5), (2, 3), (2, 6), (3, 4), 3,5); Le = 9,L3 = 7,L4 = 8, L5 = 4, Le = 14 


Problem Set 23.4, page 987 


2. 
1.“ \4-3-5 L=10 
1 
ji 
3.5-3-6¢ L=19 
2-4 
y2 
5.1 a: 
Nae 
‘5 
9. Yes 
2 
11.1-3-4¢ L= 38 
5-6 


13. New York—Washington—Chicago—Dalles—Denver—Los Angeles 
15. G is connected. If G were not a tree, it would have a cycle, but this cycle would 
provide two paths between any pair of its vertices, contradicting the uniqueness. 


App. 2. Answers to Odd-Numbered Problems A57 


19. If we add an edge (u, v) to T, then since T is connected, there is a path u—>v in T 
which, together with (u, v), forms a cycle. 


Problem Set 23.5, page 990 


1. If G is a tree. 
3. A shortest spanning tree of the largest connected graph that contains vertex 1. 
7. (1, 4), C, 3), C, 2), (2, 6), (3,5); L = 32 
9. (1, 4), (4, 3), (4, 2), 3,5); L = 20 
11. C1, 4), (4, 3), 4,5), , 2); L = 12 


Problem Set 23.6, page 997 


1. {3,6}, l11+3= 14 

3. {4,5,6}, 10+5 + 13 = 28 

5. {3,6,7}, 8+4+4= 16 

7.5 = {1,4}, 8+6= 14 

9. One is interested in flows from s to t, not in the opposite direction. 

13. Ayo = 5; Aoa = 8, Ags = 2; Aye = 5, Aos = 3; Ay = 4, A35 =9 
Py 1-2-4-5,Af=2; Py 1-2-5,Af=3; Pz 1-3-5, Af=4 

15.1 -2-—5,Af=2; 1-4-2 -—5,Af= 2, ete. 

li .fia. jon = 8) fa fos 3; hoffe 4. foo 15, FRe4r 13 = 77, 
Ff = 17 is unique. 


19. For instance, fi2 = 10, fa=fas5=7, fis =fo5=5, f5=3, foo = 2, 
f=34+54+7= 15, f= 15 is unique. 


Problem Set 23.7, page 1000 


3. (2, 3) and (5, 6) 
5. By considering only edges with one labeled end and one unlabeled end 
7.1-—2-—5,A;=2; 1-4-2-5,A,=1; f=6+2+1=9, where 6 is 
the given flow 
9.1-2-4-—-6,A;=2; 1-3-5-6,A;=1; f=4+2+1=7, where 4 
is the given flow 
15. S = {1,2,4,5}, T= {3,6}, cap(S,7) = 14 


Problem Set 23.8, page 1005 


1. No 3. No 
5. Yes, S = {1, 4,5, 8} 
7. Yes, S = {1, 3,5} 11.1-2-3-7-5-4 


13. 1 — 2-3 —7-—5 — 4is augmenting and gives 1 — 2 — 3 — 7 — 5 — 4and (1, 2), 
(3, 7),(5, 4) is of maximum cardinality. 

15.1 —4-—3-—6-—7-8 is augmenting and gives |-4—3-—6-—7-—8 and 
(1, 4), (G, 6), (7, 8) is of maximum cardinality. 

19. 3 21.2 

23. 3 25. Kq 
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Chapter 23 Review Questions and Problems, page 1006 
0 0) 1 1 
0 0) 1 1 
1 1 0 0 


1 1 0 0 


13. To vertex 1 2 3 4 
From vertex 1 lo I 0 l 


@) (3) 
17. 
Vertex Incident Edges 
1 (, 2), (1, 4) 
2 (2, 1), (2, 4) 
3 (3, 4) 
4 (4, 1), (4, 2), (4, 3) 


19. (1, 2), (1, 4), (2,3); Le = 2,L3 =5,L4 =5 
23. (1, 6), (4, 5), (2, 3), (7, 8) 


Problem Set 24.1, page 1015 


1. gr, = 19, au = 20, gy = 20.5 3. gz, = 138, qu = 144, gy = 154 
5g = 199,96 =20lge = 201 Tee = 13g" = TA = 1,45 

9. gp = 89.9, qu = 91.0, gy = 91.8 11. X = 19.875, 5 = 0.835, IQR = 1.5 
13. X = 144.67, s = 8.9735, IQR = 16 15. X = 1.355, s = 0.136, IQR = 0.15 
17. 3.54, 1.29 


Problem Set 24.2, page 1017 
1. 2? outcomes: RRR, RRL, RLR, LRR, RLL, LRL, LLR, LLL 


3. 62 = 36 outcomes (1, 1), (1, 2),-+-, (6, 6), first number (second number) referring 
to the first die (second die) 
5. Infinitely many outcomes H TH TTH TITH .::- (H = Head,T = Tail) 
7. The space of ordered pairs of numbers 
9.10 outcomes: D ND NND_ --- NNNNNNNNND 
11. Yes 


17. AUB = B implies A C B by the definition of union. Conversely. A C B implies 
that AUB = B because always B C AUB, and if A C B, we must have equality 
in the previous relation. 
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Problem Set 24.3, page 1024 


1. 1 — 4/216 = 98.15%, by Theorem | 
3. (a) 0.93 = 72.9%, (b) 2y, - 83 - 88 = 72.65% 

5.3 

7. Small sample from a large population containing many items in each class we are 


interested in (defectives and nondefectives, etc.) 


498 | 497 , 496 495 494 _ 
9. 500 * 499 ° ae 497 * 496 ~ 0.98008 


11. (a) 390 * 199 = 24.874%, (b) 300 * 199 + 300 * 199 = 50.25%, (c) same as (a). 
(a) + (b) + (c) = 1. Why? 

13. 1 — 0.963 = 11.5% 

15. 1 — 0.8754 = 0.4138 < 1 — 0.752 = 0.4375 < 0.5 (c<b<a) 

17. A = BU(ANB*), hence P(A) = P(B) + P(ANB‘) 2 P(B) by disjointedness of B 
and ANB* 


Problem Set 24.4, page 1028 


1. In 10! oe oa ways 

3.2-2.4.3,2.124,38.2.1,.2,1_ 4 _2,1_1 
6 5 4 3 “3 i 6 5 4 3 2 1 6! 6 5 15 

5. (':) (3) (8) = 18,000 7. 210, 70, 112, 28 

9. In 6!/6 = 120 ways 11.9+8 = 72 


13. (b) 1/(12n) 
15. P (No two people have a birthday in common) = 365 + 364 --- 346/3657° = 0.59. 
Answer: 41%, which is surprisingly large. 


Problem Set 24.5, page 1034 


1. k = 35 by (©) 

3. k = by (10), POS X S2) =5 

5. No, because of (6) 

Tk = zoo because of (6) and 1 + 8 + 27 + 64 = 100 
9.k = 5; 50% 

11. 0.52 = 12.5% 

13. F(x) = 0 if x < -1, F@®) =3(¢ + 1) if -1 Sx <0 
F(x) = 1—3(x — 17 ifO Sx <1, FQ) =1lifxS1 
Answer: 500 cans, P = 0.125, 0 

15.X >b,X 2b,X <c,X =c, ete. 


Problem Set 24.6, page 1038 


1. k=o.6=3.0 =5 3. «w= 17, 07 = 77/3; cf. Example 2 
5. =5,0° =45 dG x 2,07 4 

9. 750, 1, 0.002 11. c = 0.073 

13. $643.50 15. 3, 35. (X — $)V20 


17. X = Product of the 2 numbers. E(X) = 12.25, 12 cents 
19.(0 + 1+3+3+8+4 1+ 27)/8 = 54/8 = 6-75 
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Problem Set 24.7, page 1044 


3. 38% 

5. 2 0.5°, 0.03125, 0.15625, 1 — f(0) = 0.96875, 0.96875 

7. 0.265 

9. f(x) = 0.5%e7 9? /x!, (0) + fC) = e7°9(1.0 + 0.5) = 0.91. Answer: 9% 
11. 134% 
13. 42%, 47.2%, 10.5%, 0.3% 


15. 1 — e~°? = 18% 


Problem Set 24.8, page 1050 


1. 0.1587, 0.5, 0.6915, 0.6247 3. 45.065, 56.978, 2.022 
5. 15.9% 7. 31.1%, 95.4% 
9, About 58% 11. t = 1084 hours 


13. About 683 (Fig. 521a) 


Problem Set 24.9, page 1059 
ree 3.3.4.3 
5. fo(y) = 1/(B2 — ag) if ag << y < Bo 
7. 27.45 mm, 0.38 mm 


11. 25.26 cm, 0.0078 cm 13. 50% 
15. The distributions in Prob. 17 and Example 1 
17. No 


Chapter 24 Review Questions and Problems, page 1060 


11. 0; = 110, Oy = 112, 0y = 115 

13. ¥ = 111.9, 5 = 4.0125, s* = 16.1 

21. Xmin = Xj = Xmax. Sum over j from 1. 
17. xX = 6,5 = 3.65 

19. f(x) = (79)0.0370.975°-* = 1.577 15/21 


Xx 
MifG) = 2 = 1,3) 23. 1,35 
25. 0.1587, 0.6306, 0.5, _~—«0.4950 


Problem Set 25.2, page 1067 


n 
1. In Example 1, u = 0 so ps xj = 0.0 In €/a€ = 0 and @” is as before. 
j=l 
3. € = er Felsen!) On /dw = —n + (xy +++ + xn)/p = 0, 
np = nx, P= X = 15.3 
5.1 = pra = oy =p = k/n, k = number of successes in n trails 
7. 7/12 
9.1 = f = pl — p)" 1h, ete., p = 1/x 
11. 6 = n/Sx; = 1/x 
13.6=1 
15. Variability larger than perhaps expected 
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Problem Set 25.3, page 1077 


3. 
7. 


9. 
11. 


13. 
15. 


17. 
19. 


Shorter by a factor V2 5. 4, 16 

c = 1.96,x = 126,57 = 126- 674/800 = 106.155, k = cs/Vn = 0.714, 
CONF  95{125.3 S w & 126.7}, CONFo 95{0.1566 S p S 0.1583} 

CONF  99{ 63.72 S pw & 66.28} 

n—1=5,F(c) = 0.995, c = 4.03, ¥ = 9533.33, s” = 49,666.67, 

k = 366.66 (Table 25.2), CONFo .99{9166.7 = w S 9900} 

CONF 95{0.023 S o” S 0.085} 

n — | = 99 degrees of freedom. F(cy) = 0.025, cy = 74.2, F(cg) = 0.975, 
co = 129.6. Hence ky = 12.41, kg = 7.10. CONFo 95 {7.10 S o7 S 12.41}. 
CONF 95{0.74 S o? S 5.19} 

Z = X + Yis normal with mean 105 and variance 1.25. 

Answer: P(104 3 Z = 106) = 63% 


Problem Set 25.4, page 1086 


17. 


.t = (0.286 — 0)/(4.31/V7) = 0.18 < c = 1.94; accept the hypothesis. 

. c = 6090 > 6019: do not reject the hypothesis. 

.o”/n = 1.8, c = 57.8, accept the hypothesis. 

. be < 58.69 or p > 61.31 

. Alternative 4 # 5000, t = (4990 — 5000)/(20/V/50) = —3.54 <c = —2.01 


(Table A9, Appendix 5). Reject the hypothesis uw = 5000 g. 


no difference 


Problem Set 25.5, page 1091 


. LCL = 1 — 2.58 - 0.02/2 = 0.974, UCL = 1.026 

.27 

. Choose 4 times the original sample size 

. 2.58V/0.0004/V2 = 0.036, LCL = 3.464, UCL = 3.536 

. LCL = np — 3V np — p), CL = np, UCL = np + 3V np — p) 

. In about 30% (5%) of the cases 

. LCL = pw — 3V p is negative in (b) and we set LCL = 0, CL = p = 3.6, 


UCL = p + 3V pu = 9.3. 


Problem Set 25.6, page 1095 


1. 
5. 
9. 


13. 


15. 


0.9825, 0.9384, 0.4060 3. 0.8187, 0.6703, 0.1353 
e291 + 256), P(A; 1.5) = 94.5, a = 5.5% 7. 19.5%, 14.7% 
(=e) + 00 = ey" i, 2 3 ha AY 


xX 


9 
100 
yy ( )ou2® 0.8810°-* = 22% (by the normal approximation) 
x=0 


(1 — 6), [A — 6)°-4]’ = 0,6 = §, AOQL = 6.7% 


Aé1 


. Two-sided. t = (0.55 — 0)/V0.546/8 = 2.11 < c = 2.37 (Table A9, Appendix 5), 


. 19 + 1,07/0.8? = 29.69 < c = 30.14 (Table Al0. Appendix 5), accept the 
hypothesis 
By (12), to = V16(20.2 — 19.6)/V0.16 + 0.36 > c = 1.70. Assert that B is better. 


Nin 
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Problem Set 25.7, page 1099 


3. x2 = (40 — 50)?/50 + (60 — 50)?/50 = 4 > c = 3.84; no 

5. x2 = 18 > 11.07; yes 

7. x0 = 10.264 < 11.07; yes 

9. 42 even digits, accept. 
(355 — 358.5)" (123 — 119.5)? 

358.5 . 119.5 

freedom, 95%) 

15. Combining the last three nonzero values, we have K — r — 1 = 9(r = 1 since we 
estimated the mean, tae =~ 3.87). xo = 12.8 <c = 16.92. Accept the hypothesis. 


x 
ow 
| 


13. x2 = 0.137 < c = 3.84 (1 degree of 


Problem Set 25.8, page 1102 


3. 6)° +8: 5)° = 3.5% is the probability that 7 cases in 8 trials favor A under the 
hypothesis that A and B are equally good. Reject. 
5. (4)8(1 + 18 + 153 + 816) = 0.0038 
7.% = 9.67, s = 11.87. to = 9.67/(11.87/V/15) = 3.16 > c = 1.76 (a = 5%). 
Hypothesis rejected. 
9. Hypothesis @ = 0. Alternative @ > 0,x = 1.58, 
t= V10- 1.58/1.23 = 4.06 > c = 1.83 (a = 5%). Hypothesis rejected. 
11. Consider y; = x; — Ho. 
13. n = 8; 4 transpositions, P(T = 4) = 0.007. Assert that fertilizing increases yield. 
15. P(T S 2) = 2.8%. Assert that there is an increase. 


Problem Set 25.9, page 1111 


1. y = 0.98 + 0.495x 3. y = —11,457.9 + 43.2x 
5.y = —10 + 0.55x 7. y = 0.5932 + 0.1138x, R = 1/0.1138 
9. y = 0.32923 + 0.00032x, (66) = 0.35035 
13. c = 3.18 (Table A9), ky = 43.2, go = 54,878, K = 1.502, 
CONFo 95{41.7 S ky S 44.7}. 
15. y — 1.875 = 0.067(x — 25), 3s2 = 500, go = 0.023, K = 0.021, 
CONF 95{0.046 = ky = 0.088} 


Chapter 25 Review Questions and Problems, page 1111 


15. fi = 20.325, 6? = ()s” = 3.982 17. CONF 99{27.94 S pw S 34.81} 

19. c = 14.74 > 14,5, reject wo; (14.74 — 14.50)/V0.025) = 0.9353 

21. 2.58 - V0.00024/V/2 = 0.028, LCL = 2.722, UCL = 2.778 

23.a = 1—-— (1 — 0)§ = 5.85%, when 6 = 0.01. For 0 = 15% we obtain 
B=a- 0)° = 37.7%. If n increases, so does a, whereas B decreases. 

25. y = 3.4 — 1.85x 


ee 


A3.1 Formulas for Special Functions 


For tables of numeric values, see Appendix 5. 
Exponential function e” (Fig. 545) 
e = 2.71828 18284 59045 23536 02874 71353 
(1) evet = er, eed =e, (e”)¥ = ev 
Natural logarithm (Fig. 546) 
(2) In xy) = Inx + Iny, In (x/y) = Inx — Iny, In (&*) = alnx 
In x is the inverse of e”, and e™* = x, eM * = MO = I/x, 
Logarithm of base ten log;9x or simply log x 


(3) logx = M Inx, M = loge = 0.43429 44819 03251 82765 11289 18917 
1 1 
(4) Inx = u log x, 7 In 10 = 2.30258 50929 94045 68401 79914 54684 


log x is the inverse of 10”, and 10!°2* = x, 107108” = I/x. 


Sine and cosine functions (Figs. 547, 548). In calculus, angles are measured in radians, 
so that sin x and cos x have period 27. 
sin x is odd, sin (—x) = —sin x, and cos x is even, cos (—x) = cos x. 


-2 0) 2 x 
Fig. 545. Exponential function e* Fig. 546. Natural logarithm In x 
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Fig. 547. sin x Fig. 548. cos x 


1° = 0.01745 32925 19943 radian 
1 radian = 57° 17’ 44.80625” 
= 57.29577 95131° 
(5) sin? x + cos*x = 1 
sin (x + y) = sinx cos y + cos x siny 


sin (x — y) = sinx cos y — cos x sin y 


(6) 
cos (x + y) = cosx cos y — sin x sin y 
cos (x — y) = cosx cos y + sin x sin y 
(7) sin 2x = 2 sin x cos x, cos 2x = cos? x — sin? x 
‘ 7 T 
sin x = cos [x = cos x 
(8) - (a 
cosx = sin {x + —] =sin{— —-x 
2 2 
(9) sin (7 — x) = sin x, cos (7 — x) = —cosx 
(10) cos? x = $(1 + cos 2x), sin? x = (1 — cos 2x) 
sin x sin y = 3[—cos (x + y) + cos (x — y)] 
(11) cos x cos y = 3[cos (x + y) + cos (x — y)] 
sin x cosy = 3[sin (x + y) + sin(@ — y)] 
. . _ utu u—vU 
sinu + sinv = 2 sin cos 
2 2 
u+uvu u—vU 
(12) cos u + cosu = 2 cos 5 cos 5 
_utv ., u-v 
cosU — cosu = 2 sin sin 
2 2 


sin 6 


(13) A cosx + B sinx = VA? + B? cos (x + 8), tan 6 = 


sin 6 


B 
cos 6 A 
A 
B 


(14) Acosx+ Bsinx = VA? + B? sin(x + 8), tan 5 = ; 
cos 
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it 


tan x 
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Tangent, cotangent, secant, cosecant (Figs. 549, 550) 


sin x cos x 
(15) tanx = ; cotx = — 
cos x sin x 
tanx + tany 
(16) tan(x + y) = 


1 — tan x tan y 


a 2n 
Fig. 550. cot x 
1 1 
sec x = ; cscx = = 
COS x sin x 
tan x — tan y 
tan (x — y) = 


1 + tan x tan y 


Hyperbolic functions (hyperbolic sine sinh x, etc.; Figs. 551, 552) 


(17) 


(18) 


(19) 
(20) 


(21) 


sinh x = 3(e” — e7”), 


cosh x + sinhx = e”, 


cosh x = 3(e7 + e~*) 


cosh x — sinhx = e* 


cosh? x — sinh? x = 1 


sinh? x = 3(cosh 2x — 1), 


Fig. 551. 


Ae 
sinh x (dashed) and cosh x 


cosh? x = $(cosh 2x + 1) 


AL 
tanh x (dashed) and coth x 


Fig. 552. 
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sinh (x + y) = sinh x cosh y + cosh x sinh y 
(22) 
cosh (x + y) = cosh x cosh y + sinh x sinh y 
tanh x + tanh y 
(23) tanh (x + y) = 


1 + tanh x tanh y 


Gamma function (Fig. 553 and Table A2 in App. 5). The gamma function I'(q) is defined 
by the integral 


(‘e) 


(24) T(a) = | ety) dt (a > 0), 


0 


which is meaningful only if a > 0 (or, if we consider complex a, for those a whose real 
part is positive). Integration by parts gives the important functional relation of the gamma 
function, 


(25) T(a + 1) = al (a). 


From (24) we readily have ['(1) = 1; hence if @ is a positive integer, say k, then by 
repeated application of (25) we obtain 


(26) Tk+ 1) =k! (k=0,1,--°). 


This shows that the gamma function can be regarded as a generalization of the elementary 
factorial function. [Sometimes the notation (a — 1)! is used for [(@), even for noninteger 
values of a, and the gamma function is also known as the factorial function. ] 

By repeated application of (25) we obtain 


oy ee De)... Nate vy 
nen Se dae Nat He< (e  B 
T(a) 
5 
I 
| 
| been 
! f 
| = 
4 =2: 2 4 a 
| -2 
1 | I\r- 


| 
Fig. 553. Gamma function 
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and we may use this relation 


27 E a #0, -1,-2 
( ) (a) cae aa ra 1) ra (a +4 k) (a > ? ? ), 
for defining the gamma function for negative a (# —1, —2,---), choosing for k the 


smallest integer such that a + k + 1 > 0. Together with (24), this then gives a definition 
of (a) for all a not equal to zero or a negative integer (Fig. 553). 

It can be shown that the gamma function may also be represented as the limit of a 
product, namely, by the formula 


: n! n® 
ee Me ae ara ee 


From (27) or (28) we see that, for complex a, the gamma function I'(@) is a meromorphic 
function with simple poles at a = 0, —1, -2,---. 

An approximation of the gamma function for large positive a is given by the Stirling 
formula 


a 


(29) T(a + 1) = V27a (<) 


where e is the base of the natural logarithm. We finally mention the special value 
(30) 1) = Vn. 


Incomplete gamma functions 


x co 


(31) P(a, x) = | et°1 dt, O(a, x) = | e't*-1 dt (a > 0) 
0 G 
(32) I'(a) = P(a, x) + O(a, x) 
Beta function 
1 
(33) B(x, y) = [eM = 99" at GS 0,750) 
0 


Representation in terms of gamma functions: 


P@PO) 


(34) BOY) = TEs y 


Error function (Fig. 554 and Table A4 in App. 5) 


Xe 


2 2 
35 erfx = —= | e' dt 
a Fs, 


7 2, x8 x xt 
(36) erf x x 13 + 215 317 + tee 
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erf x 


0.5 


fespeeggp pet] peepee ope [psf ea | ep feita ee ere eee ep eae] 
-2 -l 1 2 


-1 
Fig. 554. Error function 


erf (©) = 1, complementary error function 


2 re as 
37 erfex = 1—erfx = | e° dt 
ee Vas 


Fresnel integrals’ (Fig. 555) 


(38) C(x) = | cos (¢?) dr, S(x) = | sin (t?) dt 
0 0 


C(~) = V w/8, S(%) = V 7/8, complementary functions 


c(x) = i C(x) =| cos (t”) dt 
7 CO 
so) = [fg — SQ) = | sin (12) dt 


Sine integral (Fig. 556 and Table A4 in App. 5) 


(39) 


; ” sin t 
(40) Si(x) = | a dt 
(0) 


0.5 


ee 
ede Pe ae et a ep td 
Hl 2 3 4 


Fig. 555. Fresnel integrals 


AUGUSTIN FRESNEL (1788-1827), French physicist and mathematician. For tables see Ref. [GenRef1]. 
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L l 
5 10 x 


Fig. 556. Sine integral 


Si(~@) = 7/2, complementary function 


41 ; = Si aig 
(41) sit) = i(x) ae 
Cosine integral (Table A4 in App. 5) 
; ” cos t 
(42) ci(x) = | — dt (x > 0) 
mt 

Exponential integral 

lat 
(43) Ei(x) = i — dt (x > 0) 
Logarithmic integral 

“ dt 
44 li(x) = aa 
(44) =| Ty 


A3.2 Partial Derivatives 


For differentiation formulas, see inside of front cover. 


Let z = f(x, y) be a real function of two independent real variables, x and y. If we keep 
y constant, say, y = y,, and think of x as a variable, then f(x, y,) depends on x alone. If 
the derivative of f(x, y,) with respect to x for a value x = x, exists, then the value of this 
derivative is called the partial derivative of f(x, y) with respect to x at the point (xy, y,) 
and is denoted by 


of Oz 
a or b — 
y Ox 
(a1,.y) @1,y) 
Other notations are 
fe(%q, V1) and Ze (X, Y1)5 


these may be used when subscripts are not used for another purpose and there is no danger 
of confusion. 
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We thus have, by the definition of the derivative, 


a 
() < 


: f(xy + Ax, v1) — fOr, yx) 
lim : 
Axz—>0 Ax 


(1,yp 


The partial derivative of z = f(x, y) with respect to y is defined similarly; we now keep 
x constant, say, equal to x,, and differentiate f(x,, y) with respect to y. Thus 


of 0z 


. f(y, yy + Ay) — fy, yy) 
ay = lim : 


Ay—0 Ay 


(2) 


(1,yp (1,yp 


Other notations are f(x, yy) and z, (x1, y1). 

It is clear that the values of those two partial derivatives will in general depend on the 
point (x, y,). Hence the partial derivatives dz/dx and 0z/dy at a variable point (x, y) are 
functions of x and y. The function 0z/dx is obtained as in ordinary calculus by 
differentiating z = f(x, y) with respect to x, treating y as a constant, and dz/dy is obtained 
by differentiating z with respect to y, treating x as a constant. 


Let z = f(x,y) = xy + x sin y. Then 


of af 
=~ =2xy + siny, ay = x7 + x cos y. ial 


Ox 


The partial derivatives dz/dx and dz/dy of a function z = f(x, y) have a very simple 
geometric interpretation. The function z = f(x, y) can be represented by a surface in 
space. The equation y = y, then represents a vertical plane intersecting the surface in a 
curve, and the partial derivative dz/dx at a point (x, y,) is the slope of the tangent (that 
is, tan a where a is the angle shown in Fig. 557) to the curve. Similarly, the partial 
derivative dz/dy at (x,, y,) is the slope of the tangent to the curve x = x, on the surface 
z = f(x, y) at Gi, yy). 


Fig. 557. Geometrical interpretation of first partial derivatives 
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EXAMPLE 3 


The partial derivatives dz/dx and 0z/dy are called first partial derivatives or partial 
derivatives of first order. By differentiating these derivatives once more, we obtain the 
four second partial derivatives (or partial derivatives of second order)” 


a a (a 
f-2(#)-4, 


ax? ox \ 0x 

of a (of\ _ 

ox dy ax oy Pye 

(3) 

o7f 0 (of 

dy dx dy (=) ~ Fey 
of 0 [ of 
ay? ~ dy ay ~ Fuy- 


It can be shown that if all the derivatives concerned are continuous, then the two mixed 
partial derivatives are equal, so that the order of differentiation does not matter (see Ref. 
[GenRef4] in App. 1), that is, 


a°z a°z 
4 = : 
(4) Ox dy Oy 0x 
For the function in Example 1. 
fee = 2y, fay = 2x + cosy = fya, fyy = —x siny. & 


By differentiating the second partial derivatives again with respect to x and jy, 
respectively, we obtain the third partial derivatives or partial derivatives of the third 
order of f, etc. 


If we consider a function f(x, y, z) of three independent variables, then we have the 
three first partial derivatives f,,(x, y, z), fy(@, y, Zz), and f,(x, y, z). Here f,, is obtained by 
differentiating f with respect to x, treating both y and z as constants. Thus, analogous to 
(1), we now have 


of i f(x, + Ax, yi, %1) — fd, yi, 21) 


= lim 
Ox 


> 
(1,Y1,2) Ax—0 Ax 


etc. By differentiating f,, f,, f, again in this fashion we obtain the second partial 
derivatives of f, etc. 


Let f(x, y, 2) = x2 + y? +c? + xy e*. Then 


fe = 2x + ye, fy = 2 + xe, fe = 22+ xy e*, 
fax = 2, fey = fyx = &, fos = fen = &, 
fyy = 2; fye = fey =x, fer = 2+ xy &. i 


2 CAUTION! In the subscript notation, the subscripts are written in the order in which we differentiate, 
whereas in the “0” notation the order is opposite. 
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A3.3 Sequences and Series 


THEOREM 1 


PROOF 


See also Chap. 15. 


Monotone Real Sequences 


We call a real sequence x1, X2,°**,X,, °° a monotone sequence if it is either monotone 
increasing, that is, 


or monotone decreasing, that is, 


We call x1, X5, - - - a bounded sequence if there is a positive constant K such that |x,,| < K 
for all n. 


If a real sequence is bounded and monotone, it converges. 


Let x1, X2, - +: be a bounded monotone increasing sequence. Then its terms are smaller 
than some number B and, since x; = x, for all , they lie in the interval x, = x, = B, 
which will be denoted by Jp. We bisect Jp; that is, we subdivide it into two parts of equal 
length. If the right half (together with its endpoints) contains terms of the sequence, we 
denote it by /;. If it does not contain terms of the sequence, then the left half of Jp (together 
with its endpoints) is called /,. This is the first step. 

In the second step we bisect J,, select one half by the same rule, and call it J5, and so 
on (see Fig. 558). 

In this way we obtain shorter and shorter intervals Jo, 11, Jo, - - - with the following 
properties. Each /,,, contains all /,, for n > m. No term of the sequence lies to the right 
of J,,, and, since the sequence is monotone increasing, all x,, with n greater than some 
number JA lie in J,,; of course, N will depend on m, in general. The lengths of the /,, 
approach zero as m approaches infinity. Hence there is precisely one number, call it L, 
that lies in all those intervals,? and we may now easily prove that the sequence is 
convergent with the limit L. 

In fact, given an € > 0, we choose an m such that the length of J,,, is less than e. Then 
L and all the x, with n > Mm) lie in J,,, and, therefore, |x,, — L| < e€ for all those n. 
This completes the proof for an increasing sequence. For a decreasing sequence the proof 
is the same, except for a suitable interchange of “left” and “right” in the construction of 
those intervals. a 


3This statement seems to be obvious, but actually it is not; it may be regarded as an axiom of the real number 
system in the following form. Let J;, Jz, - - - be closed intervals such that each J,, contains all J, with n > m, 
and the lengths of the J,,, approach zero as m approaches infinity. Then there is precisely one real number that 
is contained in all those intervals. This is the so-called Cantor—Dedekind axiom, named after the German 
mathematicians GEORG CANTOR (1845-1918), the creator of set theory, and RICHARD DEDEKIND 
(1831-1916), known for his fundamental work in number theory. For further details see Ref. [GenRef2] in App. 1. 
(An interval / is said to be closed if its two endpoints are regarded as points belonging to J. It is said to be open 
if the endpoints are not regarded as points of J.) 
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THEOREM 2 


PROOF 


fy 
— I, 
Fig. 558. Proof of Theorem 1 


Real Series 


Leibniz Test for Real Series 


Let x1, X2, ° + + be real and monotone decreasing to zero, that is, 
(1) (a) xy 2% 2x2: -,— (b) lim x,, = 0. 
m—->co 


Then the series with terms of alternating signs 
Ky = Xo Pag ag 
converges, and for the remainder R,, after the nth term we have the estimate 


(2) IRn| = Xn+1- 


Let s,, be the nth partial sum of the series. Then, because of (1a), 
Sy =X, So =X — X= 84, 
S3 = Sg + X3 = So, S3 = Sy — (X%_ — X3) S 54, 


so that sy S sg S s,. Proceeding in this fashion, we conclude that (Fig. 559) 


(3) Sy 283 285 Zs 2 Sg = 54 = SQ 


which shows that the odd partial sums form a bounded monotone sequence, and so do the 
even partial sums. Hence, by Theorem 1, both sequences converge, say, 


lim Son. = S, lim sy, = s*. 
n->oo Noo 
=x, 
xX. >| 


8 s Ss 


2 3 1 


Fig. 559. Proof of the Leibniz test 
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Now, since 59,41 — San = Xen+1, We readily see that (Ib) implies 


s— s* = lim Sg,,, — lim sg, = lim (So,41 — Sgn) = lim X41 = 0. 
n->co n—-co n—co 


noo 


Hence s* = s, and the series converges with the sum s. 
We prove the estimate (2) for the remainder. Since s,,— s, it follows from (3) that 


Son+1 25 = Son and also Son-1 252 Son- 
By subtracting so, and Sy,,_1, respectively, we obtain 
— > = = = = => = 
Sent1 — San = 8 — Son = O, 02 8 — Son-1 = Son — San-1- 


In these inequalities, the first expression is equal to x5,,,,, the last is equal to —x,,,, and 
the expressions between the inequality signs are the remainders Ry, and Ry,,_;. Thus the 
inequalities may be written 


> > > = 
Xon+1 = Ron = 0, O = Ron-1 = —Xan 


and we see that they imply (2). This completes the proof. a} 


A3.4 Grad, Div, Curl, V? 


in Curvilinear Coordinates 


To simplify formulas, we write Cartesian coordinates x = x1, y = Xo, Z = X3. We denote 
curvilinear coordinates by q1, gz, gg. Through each point P there pass three coordinate 
surfaces gq, = const, g2 = const, gz = const. They intersect along coordinate curves. We 
assume the three coordinate curves through P to be orthogonal (perpendicular to each 
other). We write coordinate transformations as 


(1) Xy = X1(41, Y2, Ys); Xq = X2(41, Y2, q3): X3 = X3(41, Y2, qs). 


Corresponding transformations of grad, div, curl, and V? can all be written by using 


3 a 2 
(2) => (=) . 


me OO 


Next to Cartesian coordinates, most important are cylindrical coordinates g, = r, dz = 9, 
q3 = z (Fig. 560a) defined by 


(3) x, = gy COS qg = r cos 8, Xg = qy SiNgg =r sin 6, X3 = G3 =Z 


and spherical coordinates g, = r, dg. = 9, q3 = & (Fig. 560b) defined by* 


(4) X1 = q1 COS gg sin gg = r cos O sin @, Xy = qy SiN dg sings = r sin 8 sind 


X3 = g, COS gg = rcos ¢. 


“This is the notation used in calculus and in many other books. It is logical since in it, 6 plays the same role 
as in polar coordinates. CAUTION! Some books interchange the roles of 6 and ¢. 
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(a) Cylindrical coordinates (6) Spherical coordinates 


Fig. 560. Special curvilinear coordinates 


In addition to the general formulas for any orthogonal coordinates q,, gz, g3, we shall give 
additional formulas for these important special cases. 


Linear Element ds. In Cartesian coordinates, 


ds? = dx? + dx + dx3 (Sec. 9.5). 
For the g-coordinates, 
(5) ds® = h? dq? + h2 dq? + h? di. 
(5') ds? = dr? + r2-.d@* + dz” (Cylindrical coordinates). 
For polar coordinates set dz? = 
(5") ds® = dr? + r? sin? b dé? + r7- dd? (Spherical coordinates). 


Gradient. grad f = Vf = [ fx,» Fx,» fx,] (partial derivatives; Sec. 9.7). In the 
q-system, with u, v, w denoting unit vectors in the positive directions of the q1, ds, q3 
coordinate curves, respectively, 


6) df=Vf 1 of Ey 1 of i 1 of 
grad f = - u —y w 
hy oq4 hz 92 hz 0q3 
of | of of On ; 
(6’) gradf = Vf = u t+ vt w (Cylindrical coordinates) 
or r 00 Oz 


af 1 af 1 of 


" df=Vf= am + 
oy eae f or 7 rsing 00 ” r od 


w (Spherical coordinates). 


Divergence div F = VeF = (Pye, + (Fang + F's)23 (F = [F 1, Fo, Fs], Sec. 9.8); 


0 0 0 
7) divF =VeF = hgh3Fy) + —— (hghyFs) + —— (hyhoF 
(7) IV Ayltghg E (hgh3Fy) agp (h3hyF2) qs (hyhg »| 
1 a 1 OF, Fs — 
(7') divF =VeF= (rF,) + - — + (Cylindrical coordinates) 
r or r 00 Oz 
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1 OF 5 re 1 a, : 
rsingd 060 rsind dd ee! 


; 1 oa 7 
(7) divF =Ve°F= 3 (r°F,) + 
r° or 


(Spherical coordinates). 


Laplacian V?f = VeVf = div (grad f) = Fan * Fes * Tage, (et: 9-8): 


(8) Vf = 1 | 0 (ue of). 0 (2 f) 4 fe] (= of )| 
hyhghg 0q4 hy Oqy 042 hg 9q2 0q3 hs 0s 


a? 1 0 io" a” 
(8’) Vf = - + _ “ + aa . “ (Cylindrical coordinates) 
af 2 of 1 af 1 af cotd af 
8” v= —.3——— + + 
(8°) I or rs Or r? sin? d 00" — rr? ad? r2 ad 
(Spherical coordinates). 
Curl (Sec. 9.9): 
hu hav h3w 
1 0 0 0 
(9) culF =VxF= - 
hyhghz | 0g, dg2 9q3 


hyF, hoF 5 h3F3 
For cylindrical coordinates we have in (9) (as in the previous formulas) 
h, =h, = 1, hg =he= 41 =7, hz =h, = 1 
and for spherical coordinates we have 


hy = h, = 1, hy = hy = Gy Sin gg = rsin &, i= a =F 


APPENDIX 4 


Additional Proofs 


Section 2.6, page 74 


PROOF OF THEOREM 1 Uniqueness' 
Assuming that the problem consisting of the ODE 


(1) y" + py’ + gay = 0 
and the two initial conditions 
(2) y(%o) = Ko, y' (Xo) = Ky 


has two solutions y,(x) and yo(x) on the interval J in the theorem, we show that their 
difference 


y(X) = yy(x) — yo(x) 


is identically zero on J; then y; = yg on J, which implies uniqueness. 
Since (1) is homogeneous and linear, y is a solution of that ODE on J, and since y, and 
Yg Satisfy the same initial conditions, y satisfies the conditions 


(1) y(Xo) = 0, y' (Xo) = 0. 


We consider the function 
2(x) = y(x)? + y'(x)? 


and its derivative 


ow 


z’ = 2yy’ + 2y’y”. 


From the ODE we have 


”" 


y" = —py' — qy. 
By substituting this in the expression for z’ we obtain 
(12) z! = 2yy’ — 2py’? — ayy’, 
Now, since y and y’ are real, 
Very ay = 29 sy SG 


1This proof was suggested by my colleague, Prof. A. D. Ziebur. In this proof, we use some formula numbers 
that have not yet been used in Sec. 2.6. 
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From this and the definition of z we obtain the two inequalities 


(13) (a) yy’ Sy? +y?=z (b) —2yy' Sy? +y'=z 


From (13b) we have 2yy’ = —z. Together, |2yy’| S z. For the last term in (12) we now 


obtain 
—2gyy" S |—2gyy"| = |all2yy'| = lalz. 
Using this result as well as —p S |p| and applying (13a) to the term 2yy’ in (12), we find 
z’ Sz t 2|ply? + lal 
Since y’? S y? + y’? = z, from this we obtain 
2 S(1 + 2lpl + lab 
or, denoting the function in parentheses by h, 
(14a) zg She for all x on I. 


Similarly, from (12) and (13) it follows that 


ri ’ 12, ’ 
—z' = —2yy’ + py’? +2 
(14b) yy PY qyy 
= z+ 2\|plz + lqlz = hz. 


The inequalities (14a) and (14b) are equivalent to the inequalities 
(15) zg he = 0, gy fhe 20. 
Integrating factors for the two expressions on the left are 

Fy = eW IR ax | Fy = efh@ dx, 


The integrals in the exponents exist because h is continuous. Since F, and F are positive, 
we thus have from (15) 


F(z’ — hz) = (Fi2)' =0 and F(z’ + hz) = (F22)' 2 0. 


This means that Fz is nonincreasing and F’yz is nondecreasing on J. Since z(xg) = 0 by 
(11), when x = Xo we thus obtain 


Fyz = (Fix = 9, Foz 5 (F22)n, = 9 


and similarly, when x = Xo, 
Fiz =0, Foz 2 0. 


Dividing by F and Fy and noting that these functions are positive, we altogether have 
z=0, z20 for all x on I. 


This implies that z = y2 + y’? = 0 on I. Hence y = 0 or y, = yg on I. a 


APP. 4 Additional Proofs 


Section 5.3, page 182 


PROOF OF THEOREM 2. Frobenius Method. Basis of Solutions. Three Cases 
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The formula numbers in this proof are the same as in the text of Sec. 5.3. An additional 
formula not appearing in Sec. 5.3 will be called (A) (see below). 
The ODE in Theorem 2 is 


D(x) oon — F 
x 


(1) y"+ = 0, 


where b(x) and c(x) are analytic functions. We can write it 
(1’) xy” + xb(x)y’ + c(x)y = 0. 
The indicial equation of (1) is 

(4) rir — 1) + bor + co = O. 


The roots 1, re of this quadratic equation determine the general form of a basis of solutions 
of (1), and there are three possible cases as follows. 


Case 1. Distinct Roots Not Differing by an Integer. A first solution of (1) is of the form 
(5) yy(x) = x"1(ag + ayx + ax? + ---) 


and can be determined as in the power series method. For a proof that in this case, the 
ODE (1) has a second independent solution of the form 


(6) Yo(x) = x"2(Ag + Ayx + Agx® + ---), 
see Ref. [A11] listed in App. 1. 


Case 2. Double Root. The indicial equation (4) has a double root r if and only if 
(by — 1)? — 4cq = 0, and then r = 3(1 — bo). A first solution 


my) yi) = x" (dq + ayx + dgx* +++ -), r= 3(1 — bo), 


can be determined as in Case |. We show that a second independent solution is of the 
form 


(8) yo(x) = yy(x) In x + x"(Ayx + Agx? + ---) (x > 0). 


We use the method of reduction of order (see Sec. 2.1), that is, we determine u(x) such 
that yo(x) = u(x)y1(X) is a solution of (1). By inserting this and the derivatives 


yo = u'y, + uy}, yg = uly, + 2u'y, + uy] 
into the ODE (1’) we obtain 


xu", + 2u'y; + uy) + xb(u'y; + uy) + cuy, = 0. 
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Since y, is a solution of (1"), the sum of the terms involving u is zero, and this equation 
reduces to 


x’yyu" + 2x?yiu! + xby,u! = 0. 


By dividing by x*y, and inserting the power series for b we obtain 


Here, and in the following, the dots designate terms that are constant or involve positive 
powers of x. Now, from (7), it follows that 


Si 8 ag Ft Dae +e 


Vy xX"[dg +ayxt--:| 


1 (ote t Dew Fo) r 


x dg Fayxrss: 


Hence the previous equation can be written 


or +b 
(A) w+ (2A 4...) u’ =0. 


Xx 


Since r = (1 — bo)/2, the term (2r + bo)/x equals 1/x, and by dividing by u’ we thus 
have 


By integration we obtain Inu’ = —Inx + ---, hence u’ = (1/x)e“ °°”. Expanding the 
exponential function in powers of x and integrating once more, we see that u is of the form 


u=Inx + kx t+ kyx? t---. 
Inserting this into y, = uy,, we obtain for yz a representation of the form (8). 


Case 3. Roots Differing by an Integer. We write 7; = r and rg = r — p where p isa 
positive integer. A first solution 


(9) yi(x) = x" (dg + ayx + dox? + +++) 


can be determined as in Cases 1 and 2. We show that a second independent solution is 
of the form 


(10) yo(x) = kyy(x) Inx + x"2(Ag + Ayx + Aox” Sh eto) 


where we may have k # 0 or k = 0. As in Case 2 we set yp = uy. The first steps are 
literally as in Case 2 and give Eq. (A), 


urs (4b re] u' = 0. 


x 
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Now by elementary algebra, the coefficient by — 1 of r in (4) equals minus the sum of 
the roots, 


bo — 1 = -(4 + re) = -— +r - p) = -2r + p. 


Hence 2r + by = p + 1, and division by u’ gives 


" 
Pe(ee wee), 
u! x 


The further steps are as in Case 2. Integrating, we find 


Inu’ = -(p + 1)Inx+---, thus ul =x PPE? 


where dots stand for some series of nonnegative integer powers of x. By expanding the 
exponential function as before we obtain a series of the form 


kp 
+ Et kia + kpeax toe, 


We integrate once more. Writing the resulting logarithmic term first, we get 


1 


1 ky 
u=k,Inx + SE 
x 


Hence, by (9) we get for yp = uy, the formula 
Yo = kpyy Inx + qo" ( =F i Pee a ee ) (dg + ayx +++). 


But this is of the form (10) with k = k, since r; — p = rg and the product of the two 
series involves nonnegative integer powers of x only. a 


Section 7.7, page 293 


THEOREM Determinants 


The definition of a determinant 


a1 12 —- Gin 

a1 A292 s don 
(7) D = detA = 

ani ang Oe Ann 


as given in Sec. 7.7 is unambiguous, that is, it yields the same value of D no matter 
which rows or columns we choose in the development. 
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In this proof we shall use formula numbers not yet used in Sec. 7.7. 
We shall prove first that the same value is obtained no matter which row is chosen. 
The proof is by induction. The statement is true for a second-order determinant, for 
which the developments by the first row a4,dg2 + dy2(—d2;) and by the second row 
d21(— 42) + dg2Q11 give the same value d11d22 — dj2d2,. Assuming the statement to be 
true for an (7 — 1)st-order determinant, we prove that it is true for an mth-order determinant. 


For this purpose we expand D in terms of each of two arbitrary rows, say, the ith and 
the jth, and compare the results. Without loss of generality let us assume i < j. 
First Expansion. We expand D by the ith row. A typical term in this expansion is 
(19) Ai Cig = Gi, * (— 1) Mix. 


The minor M;;, of a;;, in D is an (n — 1)st-order determinant. By the induction hypothesis 
we may expand it by any row. We expand it by the row corresponding to the jth row of 
D. This row contains the entries aj, (J # k). It is the (j — 1)st row of Mj, because Mj, 
does not contain entries of the ith row of D, and i < j. We have to distinguish between 
two cases as follows. 


Case I. If |< k, then the entry a,, belongs to the /th column of Mj, (see Fig. 561). Hence 
the term involving a,, in this expansion is 


(20) ail ° (cofactor of Qt in Mj) _ Qj . (- 1)9—P* Mii 


where M;,;, is the minor of aj, in Mj. Since this minor is obtained from M;,, by deleting 
the row and column of ajz, it is obtained from D by deleting the ith and jth rows and the 
kth and /th columns of D. We insert the expansions of the M,,, into that of D. Then it follows 
from (19) and (20) that the terms of the resulting representation of D are of the form 
(21a) aircdjr* (—1)?Mire (<k) 
where 
b=itkt+j+I1—-1. 
Case II. If | > k, the only difference is that then a, belongs to the (J — 1)st column of 


M,;;,, because M;;, does not contain entries of the kth column of D, and k < /. This causes 
an additional minus sign in (20), and, instead of (21a), we therefore obtain 


(21b) = igi * (—1)°Mineji (1 > k) 
where bD is the same as before. 


Ith kth kth ith 
col. col. col. col. 


= - | ith row ay — 
I ] 
| | 

Gi jth row (%)- 
| | 
I | 
| | 


Case I Case II 


Fig. 561. Cases | and II of the two expansions of D 
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Second Expansion. We now expand D at first by the jth row. A typical term in this 
expansion is 


(22) aj Cir = ayy" (— 1M. 


By the induction hypothesis we may expand the minor M,, of a; in D by its ith row, which 
corresponds to the ith row of D, since j > i. 


Case I. If k > 1, the entry a; in that row belongs to the (k — 1)st column of M;;, because 
M,, does not contain entries of the /th column of D, and / < k (see Fig. 561). Hence the 
term involving a,;, in this expansion is 


(23) ai? (cofactor of Qik in Mj) = ayx* (- Ly OM as 


where the minor M;,,;; of aj, in Mj, is obtained by deleting the ith and jth rows and the 
kth and /th columns of D [and is, therefore, identical with M;;,;; in (20), so that our notation 
is consistent]. We insert the expansions of the M;; into that of D. It follows from (22) and 
(23) that this yields a representation whose terms are identical with those given by (21a) 
when / < k. 


Case II. If k <1, then aj, belongs to the kth column of M;;, we obtain an additional minus 
sign, and the result agrees with that characterized by (21b). 


We have shown that the two expansions of D consist of the same terms, and this proves 
our statement concerning rows. 

The proof of the statement concerning columns is quite similar; if we expand D in 
terms of two arbitrary columns, say, the kth and the /th, we find that the general term 
involving aja; is exactly the same as before. This proves that not only all column 
expansions of D yield the same value, but also that their common value is equal to the 
common value of the row expansions of D. 

This completes the proof and shows that our definition of an nth-order determinant is 
unambiguous. ia 


Section 9.3, page 368 
PROOF OF FORMULA (2) 
We prove that in right-handed Cartesian coordinates, the vector product 
v=aXb=[a, ds, a3] X [b, be, bs| 
has the components 


(2) V1 = agb3 — azbo, Vg = d3b, — abs, V3 = aybz — agby. 


We need only consider the case v # 0. Since v is perpendicular to both a and b, Theorem 
1 in Sec. 9.2 gives a * v = 0 and b * v = 0; in components [see (2), Sec. 9.2], 


a1V1 + Ags + dg’g = 0 


3 
( ) by, + byVe T bzV3 = 0. 
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Multiplying the first equation by bs, the last by a3, and subtracting, we obtain 
(agb — a4b3)V1 = (dgb3z — agb2)v2. 

Multiplying the first equation by b,, the last by a,, and subtracting, we obtain 
(aybz — dgb,)V2 = (43h, — ayb3)v3. 

We can easily verify that these two equations are satisfied by 

(4) V1 = C(agb3 — azbg), Uz = C(a3by — ayb3), U3 = C(aybe — aby) 


where c is a constant. The reader may verify, by inserting, that (4) also satisfies (3). Now 
each of the equations in (3) represents a plane through the origin in VU gU3-space. The 
vectors a and b are normal vectors of these planes (see Example 6 in Sec. 9.2). Since 
v # 0, these vectors are not parallel and the two planes do not coincide. Hence their 
intersection is a straight line L through the origin. Since (4) is a solution of (3) and, for 
varying c, represents a straight line, we conclude that (4) represents L, and every solution 
of (3) must be of the form (4). In particular, the components of v must be of this form, 
where c is to be determined. From (4) we obtain 


Iv|? = vi ate v3 + v3 = c*[(aabs = a3by)” + (dgb, — ayb3)" + (ayby — dyb,)"]. 
This can be written 
Iv? = c?[(aZ + af + a3)(b? + bz + b3) — (ayby + agbe + azbs)°], 


as can be verified by performing the indicated multiplications in both formulas and 
comparing. Using (2) in Sec. 9.2, we thus have 


lv? = c?[(a * a)(b * b) — (a+ by]. 


By comparing this with formula (12) in Prob. 4 of Problem Set 9.3 we conclude that 
c=Hl1. 

We show that c = +1. This can be done as follows. 

If we change the lengths and directions of a and b continuously and so that at the end 
a = i and b = j (Fig. 188a in Sec. 9.3), then v will change its length and direction 
continuously, and at the end, v = i X j = k. Obviously we may effect the change so that 
both a and b remain different from the zero vector and are not parallel at any instant. 
Then v is never equal to the zero vector, and since the change is continuous and c can 
only assume the values +1 or —1, it follows that at the end c must have the same value 
as before. Now at the end a = i, b = j, v = K and, therefore, a, = 1, by = 1, v3 = 1, 
and the other components in (4) are zero. Hence from (4) we see that v3 = c = +1. This 
proves Theorem 1. 

For a left-handed coordinate system, i X j = —k (see Fig. 188b in Sec. 9.3), resulting 
in c = —1. This proves the statement right after formula (2). ia 


APP. 4 Additional Proofs A85 


Section 9.9, page 408 


PROOF OF THE 


INVARIANCE OF THE CURL 


This proof will follow from two theorems (A and B), which we prove first. 


THEOREM A 


Transformation Law for Vector Components 


For any vector v the components V1, Vg, V3 and v}<, VS, Vs in any two systems of 
Cartesian coordinates x, X9, X3 and x*, x*, x¥, respectively, are related b 
1X2, X3 1> X2, X3 


Up = Cq1Vy + CyQVe + Cy3V3 


(1) UZ = CoV1 + CogVe + CogV3 
k 
V3 = C31V1 + C3QVeq t+ C33U3, 


and conversely 


_ * * * 
Vy = CUT + Cg1V_ + C31V3 

= * * * 

(2) Ug = CyQU7 + Cogl3 + C39V3 


_ * * * 
U3 = Cy3U7 + Co3l3 + C3303 


with coefficients 


Cy = i* i Cig = i*ej Cig = i*ek 
(3) Coy = jFri Cop = Jj Cog = j*ek 
C3, = k*ei C3 = k* ej C33 = k**k 
satisfying 
3 
(4) > CeiCmg = Sum —«s(K, m = 1, 2, 3), 


j=l 
where the Kronecker delta” is given by 
[ (k # m) 
] (k = m) 


and i, j, k and i*, j*, k* denote the unit vectors in the positive x1-, X-, X3- and 
xt-, x3-, x$-directions, respectively. 


2LEOPOLD KRONECKER (1823-1891), German mathematician at Berlin, who made important 
contributions to algebra, group theory, and number theory. 


We shall keep our discussion completely independent of Chap. 7, but readers familiar with matrices should 


recognize that we are dealing with orthogonal transformations and matrices and that our present theorem 
follows from Theorem 2 in Sec. 8.3. 
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The representation of v in the two systems are 
(5) (a) v= v,4i + Voj + Usk (b) v = vii*® + vaj* + vsk*. 


Since i* ¢ i* = 1, i* * j* = 0, i* « k* = 0, we get from (5b) simply i* * v = vj and 
from this and (5a) 


vi = Pe vy = i* © Ui + I* © Voj + 1 © Usk = Uy ¢ i + Uoi® © j + vsi* © k. 


Because of (3), this is the first formula in (1), and the other two formulas are obtained 
similarly, by considering j* * v and then k* « v. Formula (2) follows by the same idea, 
taking i * v = v, from (5a) and then from (5b) and (3) 


VU, =iev =vieie i* + vSi es j* + vei k* = cyvt + cous + cy4V4, 
and similarly for the other two components. 
We prove (4). We can write (1) and (2) briefly as 


3 


3 
(6) (a) uv; = S ree bo => Cygj V5. 
jel 


m=1 
Substituting uv; into vz, we get 


3 3 3 3 
v= > Ckj = Culm = Pas (5 “uta , 
j=l m=1 m= 


rv 


where k = 1, 2, 3. Taking k = 1, we have 


3 3 3 
vt = vt > C15 C1; + vs ys C15 C25 + v3 > C1535 é 
j=l j=1 
For this to hold for every vector v, the first sum must be | and the other two sums 0. This 
proves (4) with k = | form = 1, 2, 3. Taking k = 2 and then k = 3, we obtain (4) with 
k = 2 and 3, for m = 1, 2, 3. 


Transformation Law for Cartesian Coordinates 


The transformation of any Cartesian x,XgxX3-coordinate system into any other 
Cartesian x{x5x3-coordinate system is of the form 


3 
(7) x= D>) Gogh + Bins m= 1,2, 3, 
j=l 


with coefficients (3) and constants by, be, bg; conversely, 


3 
(8) X_ = Dd) Capxt + By, = 1.09, 


n=1 
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Theorem B follows from Theorem A by noting that the most general transformation of a 
Cartesian coordinate system into another such system may be decomposed into a 
transformation of the type just considered and a translation; and under a translation, 
corresponding coordinates differ merely by a constant. 


PROOF OF THE INVARIANCE OF THE CURL 
We write again x1, X5, X3 instead of x, y, z, and similarly x}, x3, x for other Cartesian 
coordinates, assuming that both systems are right-handed. Let a, dy, a3 denote the 
components of curl v in the x,x2x3-coordinates, as given by (1), Sec. 9.9, with 


xX = X41, y = Xa, Z = Xz. 


Similarly, let a7, a3, a¥ denote the components of curl v in the xjx3x$-coordinate system. 
We prove that the length and direction of curl v are independent of the particular choice 
of Cartesian coordinates, as asserted. We do this by showing that the components of curl 
v satisfy the transformation law (2), which is characteristic of vector components. We 
consider a,. We use (6a), and then the chain rule for functions of several variables (Sec. 
9.6). This gives 


3 
dv du dv* dp* 
a= _ 3 pe (ens Fy m — Cme2 5 ide 
0X5 0X3 m=1 x2 x3 
a) 2 dy* x; dy* axz 
= c a 
DD {6m OX; Xa m2 ax* OKs 
m=1j=1 
From this and (7) we obtain 
3 3 * 
Om 
a, = » = (CrngCje -_ Cm2Cj3) an 
m=1 j=1 
dvs avs 
= (€33Co2 — C32C23) oxe axe 


= " * * 
= (CggCo2 — Cg2Co3)aq + (CigC32 — C12C33)az + (CogCi2 — Co2C13)a3- 


Note what we did. The double sum had 3 X 3 = 9 terms, 3 of which were zero (when 

m = j), and the remaining 6 terms we combined in pairs as we needed them in getting 
cy oy cy 

a1, a2, 43. 


We now use (3), Lagrange’s identity (see Formula (15) in Team Project 24 in Problem 
Set 9.3) and k* x j* = —i* andk X j = —i. Then 


C33Co2 — C32Co3 = (k* * k)(j* * j) — (k* * ()G* * k) 


= (k* x j*) * (kx j) =i i= cy, ete. 


A8&8& 


APP. 4 Additional Proofs 


Hence a, = cy,a~ + co\a% + c3,a%. This is of the form of the first formula in (2) in 
Theorem A, and the other two formulas of the form (2) are obtained similarly. This proves 
the theorem for right-handed systems. If the x,x2x3-coordinates are left-handed, then 
k X j = +i, but then there is a minus sign in front of the determinant in (1), Sec. 9.9. Ml 


Section 10.2, page 420 


PROOF OF THEOREM 1, PART (b)_ We prove that if 


(1) | Fe dr = | dx + Fy dy + Fs dz) 


with continuous F,, Fy, F3 in a domain D is independent of path in D, then F = grad f 
in D for some f; in components 


Q) Aa fips = 
1 Ox? 2 dy’ 3 Oz 


We choose any fixed A: (Xo, Yo, Zo) in D and any B: (x, y, z) in D and define f by 
B 
3) f(x,y. 2) = fo + [ (Fy ae* + Fy dy* + Fs de*) 
A 


with any constant fg and any path from A to B in D. Since A is fixed and we have 
independence of path, the integral depends only on the coordinates x, y, z, so that (3) 
defines a function f(x, y, z) in D. We show that F = grad f with this f, beginning with 
the first of the three relations (2’). Because of independence of path we may integrate 
from A to By: (x1, y, z) and then parallel to the x-axis along the segment B,B in Fig. 562 
with B, chosen so that the whole segment lies in D. Then 


B, B 
f(x, y, Zz) = fo + | (F, dx* + Fy dy* + Fs dz*) + | (F, dx* + F, dy* + Fs dz*). 
A By 


We now take the partial derivative with respect to x on both sides. On the left we get 
of/ox. We show that on the right we get F,. The derivative of the first integral is zero 
because A: (Xo, Yo, Zo) and By: (x1, y, z) do not depend on x. We consider the second 
integral. Since on the segment B,B, both y and z are constant, the terms Fy dy* and 


x 


Fig. 562. Proof of Theorem 1 
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F3 dz* do not contribute to the derivative of the integral. The remaining part can be written 
as a definite integral, 


B x 
| Fy dx* = | Fy(x*, y, z) dx*. 
By, xy 


Hence its partial derivative with respect to x is F(x, y, z), and the first of the relations 
(2') is proved. The other two formulas in (2’) follow by the same argument. B 
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THEOREM 


PROOF 


Reality of Eigenvalues 


If p,q, 7, and p' in the Sturm-Liouville equation (1) of Sec. 11.5 are real-valued and 
continuous on the interval a = x S&S b and r(x) > 0 throughout that interval (or 
r(x) < 0 throughout that interval), then all the eigenvalues of the Sturm—Liouville 
problem (1), (2), Sec. 11.5, are real. 


Let A = a + if be an eigenvalue of the problem and let 
y(x) = u(x) + iva) 


be a corresponding eigenfunction; here a, 6, u, and v are real. Substituting this into (1), 
Sec. 11.5, we have 


(pu' + ipv')' + (q + ar + iBr(u + iv) = 0. 


This complex equation is equivalent to the following pair of equations for the real and 
the imaginary parts: 
(pu')' + (¢ + aru — Brv = 0 


(pu')' + (q + aru + Bru = 0. 
Multiplying the first equation by v, the second by —u and adding, we get 


— Bu? + v?)r = u(pu')’ — v(pu'y’ 


= [(pu')u — (pu')yv]’. 


The expression in brackets is continuous on a = x S b, for reasons similar to those in 
the proof of Theorem 1, Sec. 11.5. Integrating over x from a to b, we thus obtain 


b 


b 
pf (u2 + v*)r dx = te’ a vo)| 


a 


Because of the boundary conditions, the right side is zero; this is as in that proof. Since 
y is an eigenfunction, u? + v? # 0. Since y and r are continuous and r > 0 (or r < 0) 
on the interval a = x S 5, the integral on the left is not zero. Hence, 8 = 0, which means 
that A = ais real. This completes the proof. a 
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Section 13.4, page 627 


PROOF OF THEOREM 2. Cauchy—Riemann Equations 


We prove that Cauchy—Riemann equations 
(1) uy, =U u, = —V,z 


are sufficient for a complex function f(z) = u(x, y) + iv(x, y) to be analytic; precisely, if 
the real part u and the imaginary part v of f(z) satisfy (1) in a domain D in the complex 
plane and if the partial derivatives in (1) are continuous in D, then f(z) is analytic in D. 


In this proof we write Az = Ax + iAy and Af = f(z + Az) — f(z). The idea of proof 
is as follows. 


(a) We express Af in terms of first partial derivatives of u and v, by applying the mean 
value theorem of Sec. 9.6. 


(b) We get rid of partial derivatives with respect to y by applying the Cauchy—Riemann 
equations. 


(c) We let Az approach zero and show that then Af/Az, as obtained, approaches a limit, 
which is equal to wu, + iv,, the right side of (4) in Sec. 13.4, regardless of the way of 
approach to zero. 


(a) Let P: (x, y) be any fixed point in D. Since D is a domain, it contains a neighborhood 
of P. We can choose a point Q: (x + Ax, y + Ay) in this neighborhood such that the 
straight-line segment PQ is in D. Because of our continuity assumptions we may apply 
the mean value theorem in Sec. 9.6. This yields 


u(x + Ax, y + Ay) — ux, y) = (Ax)u,(My) + (Ay)u,(My) 


v(x + Ax, y + Ay) — v@, y) = (Ax)uz(M2) + (Ay)vy(M2) 


where M, and M, (# M, in general!) are suitable points on that segment. The first line 
is Re Af and the second is Im Af, so that 


Af = (Ax)u,(M,) + (Ay)uy(M,) + i[(Ax)v,(M2) + (Ay)v,(Mp)]. 
(b) u, = —v,, and v, = u, by the Cauchy—Riemann equations, so that 


Af = (Ax)u,(M1) — (Ay)v,(My) + i[(Ax)v, (Mz) + (Ay)u;,(M3)].- 


Also Az = Ax + iAy, so that we can write Ax = Az — iAy in the first term and 
Ay = (Az — Ax)/i = —i(Az — Ax) in the second term. This gives 


Af = (Az — iAy)u,(M,) + i(Az — Ax)v,(My) + i[(Ax)u,(M2) + (Ay)u,(Mp))- 
By performing the multiplications and reordering we obtain 


Af = (Az)u,(M,) — iAy{u,(My) — u,(M2)} 
+ i[(Az)v(M,) ~ Ax{v,(M,) ~ Vx(My)} ]. 
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Division by Az now yields 


A iA j 
(i I ann agi — = ie =) = = he = 
Az Az Az 


(c) We finally let Az approach zero and note that |Ay/Az| = 1 and |Ax/Az| = 1 in (A). 
Then Q: (x + Ax, y + Ay) approaches P: (x, y), so that M, and M, must approach P. 
Also, since the partial derivatives in (A) are assumed to be continuous, they approach 
their value at P. In particular, the differences in the braces {- - -} in (A) approach zero. 
Hence the limit of the right side of (A) exists and is independent of the path along which 
Az — 0. We see that this limit equals the right side of (4) in Sec. 13.4. This means that 
f(z) is analytic at every point z in D, and the proof is complete. si 


Section 14.2, pages 653-654 


GOURSAT’S PROOF OF CAUCHY’S INTEGRAL THEOREM  Goursat proved Cauchy’s 
integral theorem without assuming that f’(z) is continuous, as follows. 

We start with the case when C is the boundary of a triangle. We orient C 
counterclockwise. By joining the midpoints of the sides we subdivide the triangle into 
four congruent triangles (Fig. 563). Let Cy, Cy, Cr, Cry denote their boundaries. We 
claim that (see Fig. 563). 


(1) Pfd=P fir p facr$ facrh f dz. 


Civ 


Indeed, on the right we integrate along each of the three segments of subdivision in both 
possible directions (Fig. 563), so that the corresponding integrals cancel out in pairs, and 
the sum of the integrals on the right equals the integral on the left. We now pick an integral 
on the right that is biggest in absolute value and call its path C,. Then, by the triangle 
inequality (Sec. 13.2), 


If Fae 


We now subdivide the triangle bounded by C, as before and select a triangle of 
subdivision with boundary Cy for which 


~ - ~ 


=|f ra f fac f fae f fa s4|f fae 


=4 


Then | p fad p f dz| . 


YJ 


Fig. 563. Proof of Cauchy’s integral theorem 
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Continuing in this fashion, we obtain a sequence of triangles T,, Ts, - + - with boundaries 

C,, Cy, - + + that are similar and such that 7, lies in 7,,, when n > m, and 

(2) If ra sa" lp faz), n=1,2,-+-°. 
Cc Ch 


Let Zo be the point that belongs to all these triangles. Since f is differentiable at z = Zo, 
the derivative f’ (zo) exists. Let 


f '(Z): 


& nig) =< £0. = £60) 
one 


Solving this algebraically for f(z) we have 
f@ = FE) + (& — z)f' Zo) + ADE = Z)- 


Integrating this over the boundary C,, of the triangle 7,, gives 


f fe de=$ feo) det (c— eodf' eo) +h WEE — zo)dz. 
Cp Ch Ch C, 


Since f(zp) and f' (zp) are constants and C,, is a closed path, the first two integrals on the 
right are zero, as follows from Cauchy’s proof, which is applicable because the integrands 
do have continuous derivatives (0 and const, respectively). We thus have 


fp fedz=4 hilz x) de. 
Cy Ce 


Since f (Zo) is the limit of the difference quotient in (3), for given € > O we can find a 
6 > 0 such that 


(4) |A(z)| < € when lIe— zl < 2. 
We may now take n so large that the triangle T,, lies in the disk |z — zp| < 6. Let L,, be 


the length of C,,. Then |z — zp| < L,, for all z on C,, and Zp in 7,,. From this and (4) we 
have |h(z)(z — Z)| < €L,. The ML-inequality in Sec. 14.1 now gives 


= eL,:L, = €L2. 


(5) | $f fo) del = | P We\(z — z9) dz 
Ch Ch 

Now denote the length of C by L. Then the path C, has the length L, = L/2, the path C, 

has the length L, = L,/2 = L/4, etc., and C,, has the length L, = L/2”. Hence 

L?2 = 17/4". From (2) and (5) we thus obtain 


If ra f fds 


Ch 
By choosing € (> 0) sufficiently small we can make the expression on the right as small 
as we please, while the expression on the left is the definite value of an integral. 
Consequently, this value must be zero, and the proof is complete. 


2 


n n 2 n L 2 
=4 = Mel, = 4 Tq = EL. 
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The proof for the case in which C is the boundary of a polygon follows from the previous 
proof by subdividing the polygon into triangles (Fig. 564). The integral corresponding to 
each such triangle is zero. The sum of these integrals is equal to the integral over C, 
because we integrate along each segment of subdivision in both directions, the 
corresponding integrals cancel out in pairs, and we are left with the integral over C. 

The case of a general simple closed path C can be reduced to the preceding one by 
inscribing in C a closed polygon P of chords, which approximates C “sufficiently 
accurately,” and it can be shown that there is a polygon P such that the integral over P 
differs from that over C by less than any preassigned positive real number €, no matter 
how small. The details of this proof are somewhat involved and can be found in Ref. [D6] 
listed in App. 1. | 


Fig. 564. Proof of Cauchy’s integral theorem for a polygon 


Section 15.1, page 674 


PROOF OF THEOREM 4 _ Cauchy’s Convergence Principle for Series 


(a) In this proof we need two concepts and a theorem, which we list first. 


1. A bounded sequence sj, sz, - +: is a sequence whose terms all lie in a disk of 
(sufficiently large, finite) radius K with center at the origin; thus |s,,| < K for all n. 


2. A limit point a of a sequence sj, sy, - + - is a point such that, given an € > 0, there 
are infinitely many terms satisfying |s,, — al < ¢. (Note that this does not imply 
convergence, since there may still be infinitely many terms that do not lie within that 
circle of radius € and center a.) 


Example: 4, 2, $$ 76 t= °° * has the limit points 0 and 1 and diverges. 


3. A bounded sequence in the complex plane has at least one limit point. 
(Bolzano—Weierstrass theorem; proof below. Recall that “sequence” always means infinite 
sequence.) 


(b) We now turn to the actual proof that z,; + z. + --- converges if and only if, for 
every € > 0, we can find an N such that 


(1) Ryan 2 eal SE for every n > N and p = 1, 2,--- 
Here, by the definition of partial sums, 

Snap Sp Sasa POP Gye: 
Writing n + p = r, we see from this that (1) is equivalent to 


(1*) |, — S,| < € for all r > Nandn > N. 
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Suppose that 51, sy, - - - converges. Denote its limit by s. Then for a given € > 0 we can 
find an N such that 


€ 
[s=s| <— for every n > N. 
2 


Hence, if r > N and n > N, then by the triangle inequality (Sec. 13.2), 


s,, Sp -_ I(s, s) (Sy, s)| = Is, — 5| + Sin = 5| < 5 + = €, 


€ 
2 
that is, (1*) holds. 


(c) Conversely, assume that s,, 59, --- satisfies (1*). We first prove that then the 
sequence must be bounded. Indeed, choose a fixed ¢€ and a fixed n = ng > N in (1*). 
Then (1*) implies that all s, with r > N lie in the disk of radius € and center s,,, and only 
finitely many terms sy, ° ++ , Sy May not lie in this disk. Clearly, we can now find a circle 
so large that this disk and these finitely many terms all lie within this new circle. Hence 
the sequence is bounded. By the Bolzano—Weierstrass theorem, it has at least one limit 
point, call it s. 

We now show that the sequence is convergent with the limit s. Let e > 0 be given. 
Then there is an N* such that |s, — s,,| < €/2 for all r > N* and n > N*, by (1*). Also, 
by the definition of a limit point, |s,, — s] < €/2 for infinitely many n, so that we can find 
and fix ann > N* such that |s,, — s| < €/2. Together, for every r > N*, 


€e € 
ls, = s| = |(s, = s,) + Gs, = 8)| Sls, = s,| + |s, = s| < 5 + 5 


that is, the sequence sj, Sz, °- - is convergent with the limit s. a 


Bolzano—Weierstrass Theorem? 


A bounded infinite sequence Z,, Z2, Z3, °* * in the complex plane has at least one 
limit point. 


It is obvious that we need both conditions: a finite sequence cannot have a limit point, 
and the sequence 1, 2, 3, -- - , which is infinite but not bounded, has no limit point. To 
prove the theorem, consider a bounded infinite sequence z,, Z, - - - and let K be such that 
lZn| < K for all n. If only finitely many values of the z, are different, then, since the 
sequence is infinite, some number z must occur infinitely many times in the sequence, 
and, by definition, this number is a limit point of the sequence. 

We may now turn to the case when the sequence contains infinitely many different 
terms. We draw a large square Qj that contains all z,,. We subdivide Qy into four congruent 
squares, which we number 1, 2, 3, 4. Clearly, at least one of these squares (each taken 
with its complete boundary) must contain infinitely many terms of the sequence. The 
square of this type with the lowest number (1, 2, 3, or 4) will be denoted by Qj. This is 


3BERNARD BOLZANO (1781-1848), Austrian mathematician and professor of religious studies, was a 
pioneer in the study of point sets, the foundation of analysis, and mathematical logic. 
For Weierstrass, see Sec. 15.5. 
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the first step. In the next step we subdivide Q, into four congruent squares and select a 
square Q, by the same rule, and so on. This yields an infinite sequence of squares Qo, 
Q,, Qo,° ++ ,Qn,* °° with the property that the side of Q,, approaches zero as n approaches 
infinity, and Q,, contains all Q,, with n > m. It is not difficult to see that the number 
which belongs to all these squares,* call it z = a, is a limit point of the sequence. In fact, 
given an € > 0, we can choose an N so large that the side of the square Qy is less than 
€ and, since Qy contains infinitely many z,, we have |z,, — a| < € for infinitely many n. 
This completes the proof. o 


Section 15.3, pages 688-689 


PART (b) OF THE PROOF OF THEOREM 5 
We have to show that 


= (z + Az)” — 7” say 
2 = | Az ~ 
- > An, Azl(z + Az)y?-? + 2z(z + Ag)" 3 +--+ 4 - Iz", 
nN=2 
thus, 
a 
Az 


= Ad(z + Az)? + 2z(z + AZ” 3 4-5-5 4+ (n- Dz? ]. 


If we set z + Az = b and z = a, thus Az = b — a, this becomes simply 


b” — qa” 


(7a) b-a 


— na" = (b -— aA, (n = 2,3,-°°), 


where A,, is the expression in the brackets on the right, 
(7b) Aga? lan? = 3a ee ko eae Dee 


thus, Ay = 1, Az = b + 2a, etc. We prove (7) by induction. When n = 2, then (7) holds, 
since then 


b? — a (b + a\(b — a) 


=~ 2a — 2a=b-—az=(b-— ajAs. 


Assuming that (7) holds for n = k, we show that it holds forn = k + 1. By adding and 
subtracting a term in the numerator and then dividing we first obtain 


peti — gk+l pet) — pa® + bak — gk? pe — ok 
= =b + a™. 
b-a b-a b-a 


* The fact that such a unique number z = a exists seems to be obvious, but it actually follows from an axiom 
of the real number system, the so-called Cantor—Dedekind axiom: see footnote 3 in App. A3.3. 
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By the induction hypothesis, the right side equals b[(b — ajA; + ka | + a™. Direct 
calculation shows that this is equal to 


(b — a){bA,, + ka®-1) + aka®-! + al. 
From (7b) with n = k we see that the expression in the braces {- - -} equals 
bE-1 + Qab-? + --- + (Kk — 1)ba®-? + ka®-1 = A, 3. 


Hence our result is 


perl _ qhtl 


= (b — a)Aga, + (k+ Da". 
b-a 


Taking the last term to the left, we obtain (7) with n = k + 1. This proves (7) for any 
integer n = 2 and completes the proof. ey 
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ANOTHER PROOF OF THEOREM 1. without the use of a harmonic conjugate 


We show that if w = u + iv = f(z) is analytic and maps a domain D conformally onto 
a domain D* and ®*(u, v) is harmonic in D*, then 


(1) D(x, y) = B* (u(x, y), v(x, y)) 


is harmonic in D, that is, V7 = 0 in D. We make no use of a harmonic conjugate of 
@*, but use straightforward differentiation. By the chain rule, 


®, = OF uw, + B* v,. 
We apply the chain rule again, underscoring the terms that will drop out when we form 
V0: 


®,,, is the same with each x replaced by y. We form the sum V7. In it, b*, = O*, is 
multiplied by 


WUE UyVy 
which is 0 by the Cauchy-Riemann equations. Also V2u = 0 and Vv = 0. There remains 
VD = O*(u2 + u2) + O*,(v2 + v2). 
By the Cauchy—Riemann equations this becomes 
Vb = (4, + OE ud + v2) 


and is 0 since ®* is harmonic. 


APPENDIX 5 


Tables 


For Tables of Laplace Transforms see Secs. 6.8 and 6.9. 
For Tables of Fourier Transforms see Sec. 11.10. 


If you have a Computer Algebra System (CAS), you may not need the present tables, 
but you may still find them convenient from time to time. 


Table Al_ Bessel Functions 
For more extensive tables see Ref. [GenRef1l] in App. 1. 


x Jo(x) JX) x Jo(x) J) x Jo(x) JQ) 
0.0 1.0000 0.0000 3.0 —0.2601 0.3391 6.0 0.1506 —0.2767 
0.1 0.9975 0.0499 3.1 —0.2921 0.3009 6.1 0.1773 —0.2559 
0.2 0.9900 0.0995 32 —0.3202 0.2613 6.2 0.2017 —0.2329 
0.3 0.9776 0.1483 3.3 —0.3443 0.2207 6.3 0.2238 —0.2081 
0.4 0.9604 0.1960 3.4 —0.3643 0.1792 6.4 0.2433 —0.1816 
0.5 0.9385, 0.2423 3.5 —0.3801 0.1374 6.5 0.2601 —0.1538 
0.6 0.9120 0.2867 3.6 —0.3918 0.0955 6.6 0.2740 —0.1250 
0.7 0.8812 0.3290 3.7 —0.3992 0.0538 6.7 0.2851 —0.0953 
0.8 0.8463 0.3688 3.8 —0.4026 0.0128 6.8 0.2931 —0.0652 
0.9 0.8075 0.4059 3.9 —0.4018 —0.0272 6.9 0.2981 —0.0349 
1.0 0.7652 0.4401 4.0 —0.3971 —0.0660 7.0 0.3001 —0.0047 
1.1 0.7196 0.4709 4.1 —0.3887 —0.1033 7A 0.2991 0.0252 
1.2 0.6711 0.4983 4.2 —0.3766 —0.1386 7.2 0.2951 0.0543 
1.3 0.6201 0.5220 4.3 —0.3610 —0.1719 7.3 0.2882 0.0826 
1.4 0.5669 0.5419 4.4 —0.3423 —0.2028 74 0.2786 0.1096 
1.5 0.5118 0.5579 4.5 —0.3205 —0.2311 TS 0.2663 0.1352 
1.6 0.4554 0.5699 4.6 —0.2961 —0.2566 7.6 0.2516 0.1592 
1.7 0.3980 0.5778 4.7 —0.2693 —0.2791 tii 0.2346 0.1813 
1.8 0.3400 0.5815 4.8 —0.2404 —0.2985 7.8 0.2154 0.2014 
1.9 0.2818 0.5812 4.9 —0.2097 —0.3147 7.9 0.1944 0.2192 
2.0 0.2239 0.5767 5.0 —0.1776 —0.3276 8.0 0.1717 0.2346 
Jel 0.1666 0.5683 5.1 —0.1443 —0.3371 8.1 0.1475 0.2476 
2.2 0.1104 0.5560 D2 —0.1103 —0.3432 8.2 0.1222 0.2580 
2.3 0.0555 0.5399 5.3 —0.0758 —0.3460 8.3 0.0960 0.2657 
2.4 0.0025 0.5202 5.4 —0.0412 —0.3453 8.4 0.0692 0.2708 
2.5 —0.0484 0.4971 5.5 —0.0068 —0.3414 8.5 0.0419 0.2731 
2.6 —0.0968 0.4708 5.6 0.0270 —0.3343 8.6 0.0146 0.2728 
2.7 —0.1424 0.4416 5.7 0.0599 —0.3241 8.7 —0.0125 0.2697 
2.8 —0.1850 0.4097 5.8 0.0917 —0.3110 8.8 —0.0392 0.2641 
2.9 —0.2243 0.3754 5.9 0.1220 —0.2951 8.9 —0.0653 0.2559 


Jo(x) = 0 for x = 2.40483, 5.52008, 8.65373, 11.7915, 14.9309, 18.0711, 21.2116, 24.3525, 27.4935, 30.6346 
J(x) = 0 for x = 3.83171, 7.01559, 10.1735, 13.3237, 16.4706, 19.6159, 22.7601, 25.9037, 29.0468, 32.1897 
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Table Al (continued) 

x Yo(x) Yi(x) x Yo(x) Vix) x Yo(x) Yi(x) 

0.0 (—®) (—®) 2.5 0.498 0.146 5.0 —0.309 0.148 

0.5 —0.445 —1.471 3.0 0.377 0.325 5.5 —0.339 —0.024 

1.0 0.088 —0.781 35) 0.189 0.410 6.0 —0.288 —0.175 

1.5 0.382 —0.412 4.0 —0.017 0.398 6.5 —0.173 —0.274 

2.0 0.510 —0.107 45 —0.195 0.301 7.0 —0.026 —0.303 

Table A2_ Gamma Function [see (24) in App. A3.1] 

a T(a@) a T(a) a T(a) a T(a) a T(a) 
1.00 | 1.000000 || 1.20 | 0.918 169 || 1.40 | 0.887264 |} 1.60 | 0.893515 |) 1.80 | 0.931 384 
1.02 | 0.988 844 || 1.22 | 0.913 106 || 1.42 | 0.886356 || 1.62 | 0.895924 |) 1.82 | 0.936 845 
1.04 | 0.978 438 || 1.24 | 0.908 521 1.44 | 0.885805 || 1.64 | 0.898642 || 1.84 | 0.942 612 
1.06 | 0.968 744 || 1.26 | 0.904397 || 1.46 | 0.885604 || 1.66 | 0.901 668 || 1.86 | 0.948 687 
1.08 | 0.959725 || 1.28 | 0.900718 || 1.48 | 0.885 747 || 1.68 | 0.905 001 1.88 | 0.955 071 
1.10 | 0.951 351 1.30 | 0.897 471 1.50 | 0.886 227 || 1.70 | 0.908 639 |} 1.90 | 0.961 766 
1.12 | 0.943590 |) 1.32 | 0.894640 || 1.52 | 0.887039 || 1.72 | 0.912581 1.92 | 0.968 774 
1.14 | 0.936416 || 1.34 | 0.892216 || 1.54 | 0.888178 || 1.74 | 0.916826 |) 1.94 | 0.976099 
1.16 | 0.929803 || 1.36 | 0.890185 || 1.56 | 0.889639 || 1.76 | 0.921375 || 1.96 | 0.983 743 
1.18 | 0.923728 || 1.38 | 0.888537 || 1.58 | 0.891420 || 1.78 | 0.926227 |) 1.98 | 0.991 708 
1.20 | 0.918 169 || 1.40 | 0.887264 || 1.60 | 0.893515 || 1.80 | 0.931 384 |) 2.00 | 1.000000 

Table A3_ Factorial Function and Its Logarithm with Base 10 
n n! log (n!) n n!} log (n!) n n! log (n!) 
1 1 0.000 000 6 720 2.857 332 11 39 916 800 7.601 156 
2 2 0.301 030 7 5 040 3.702 431 12 479 001 600 8.680 337 
3 6 0.778 151 8 40 320 4.605 521 13 6 227 020 800 9.794 280 
4 24 1.380 211 9 362 880 5.559 763 14 87 178 291 200 10.940 408 
5 120 2.079 181 10 3 628 800 6.559 763 15 1 307 674 368 000 12.116 500 


Table A4_ Error Function, Sine and Cosine Integrals [see (35), (40), (42) in App. A3.1] 


x Gita Si(x) ci(x) x Sat se Si(x) ci(x) 
0.0 0.0000 0.0000 ee) 2.0 0.9953 1.6054 —0.4230 
0.2 0.2227 0.1996 1.0422 2.2 0.9981 1.6876 —0.3751 
0.4 0.4284 0.3965 0.3788 2.4 0.9993 1.7525 —0.3173 
0.6 0.6039 0.5881 0.0223 2.6 0.9998 1.8004 —0.2533 
0.8 0.7421 0.7721 —0.1983 2.8 0.9999 1.8321 —0.1865 
1.0 0.8427 0.9461 —0.3374 3.0 1.0000 1.8487 —0.1196 
1.2 0.9103 1.1080 —0.4205 3.2 1.0000 1.8514 —0.0553 
14 0.9523 1.2562 —0.4620 3.4 1.0000 1.8419 0.0045 
1.6 0.9763 1.3892 —0.4717 3.6 1.0000 1.8219 0.0580 
1.8 0.9891 1.5058 —0.4568 3.8 1.0000 1.7934 0.1038 
2.0 0.9953 1.6054 —0.4230 4.0 1.0000 1.7582 0.1410 
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Table A5 Binomial Distribution 
Probability function f(x) [see (2), Sec. 24.7] and distribution function F(x) 


8100 | 0.8100 || 6400 | 0.6400 |) 4900 | 0.4900 || 3600 | 0.3600 || 2500 | 0.2500 

2) 1 1800 | 0.9900 |} 3200 | 0.9600 || 4200 | 0.9100 || 4800 | 0.8400 || 5000 | 0.7500 
2 || 0100 | 1.0000 |} 0400 | 1.0000 |; 0900 | 1.0000 || 1600 | 1.0000 || 2500 | 1.0000 

0 || 7290 | 0.7290 || 5120 | 0.5120 || 3430 | 0.3430 || 2160 | 0.2160 || 1250 | 0.1250 

3 1 || 2430 | 0.9720 || 3840 | 0.8960 || 4410 | 0.7840 || 4320 | 0.6480 || 3750 | 0.5000 
2 || 0270 | 0.9990 |} 0960 | 0.9920 || 1890 | 0.9730 || 2880 | 0.9360 || 3750 | 0.8750 

3 || 0010 | 1.0000 |} 0080 | 1.0000 |; 0270 | 1.0000 || 0640 | 1.0000 || 1250 | 1.0000 

0 || 6561 | 0.6561 || 4096 | 0.4096 || 2401 | 0.2401 1296 | 0.1296 ||} 0625 | 0.0625 

1 || 2916 | 0.9477 || 4096 | 0.8192 || 4116 | 0.6517 || 3456 | 0.4752 || 2500 | 0.3125 

4 | 2 || 0486 | 0.9963 || 1536 | 0.9728 || 2646 | 0.9163 || 3456 | 0.8208 || 3750 | 0.6875 
3 || 0036 | 0.9999 || 0256 | 0.9984 || 0756 | 0.9919 || 1536 | 0.9744 || 2500 | 0.9375 

4 || 0001 1.0000 |} 0016 | 1.0000 || 0081 1.0000 |; 0256 | 1.0000 |} 0625 | 1.0000 

0 || 5905 | 0.5905 || 3277 | 0.3277 || 1681 | 0.1681 || 0778 | 0.0778 || 0313 | 0.0313 

1 || 3281 | 0.9185 || 4096 | 0.7373 || 3602 | 0.5282 || 2592 | 0.3370 || 1563 | 0.1875 

5 2 || 0729 | 0.9914 || 2048 | 0.9421 || 3087 | 0.8369 || 3456 | 0.6826 || 3125 | 0.5000 
3 || 0081 | 0.9995 |} 0512 | 0.9933 |} 1323 | 0.9692 || 2304 | 0.9130 || 3125 | 0.8125 

4 || 0005 | 1.0000 || 0064 | 0.9997 || 0284 | 0.9976 || 0768 | 0.9898 || 1563 | 0.9688 

5 || 0000 | 1.0000 |} 0003 | 1.0000 |} 0024 | 1.0000 |} 0102 | 1.0000 || 0313 | 1.0000 

0 || 5314 | 0.5314 || 2621 | 0.2621 1176 | 0.1176 || 0467 | 0.0467 || 0156 | 0.0156 

1 || 3543 | 0.8857 || 3932 | 0.6554 || 3025 | 0.4202 || 1866 | 0.2333 || 0938 | 0.1094 

2 || 0984 | 0.9841 |} 2458 | 0.9011 || 3241 | 0.7443 || 3110 | 0.5443 || 2344 | 0.3438 

6 | 3 || 0146 | 0.9987 || 0819 | 0.9830 || 1852 | 0.9295 || 2765 | 0.8208 || 3125 | 0.6563 
4 || 0012 | 0.9999 || 0154 | 0.9984 || 0595 | 0.9891 1382 | 0.9590 || 2344 | 0.8906 

5 || 0001 1.0000 |} 0015 | 0.9999 || 0102 | 0.9993 || 0369 | 0.9959 || 0938 | 0.9844 

6 || 0000 | 1.0000 || 0001 1.0000 |} 0007 | 1.0000 || 0041 1.0000 || 0156 | 1.0000 

0 || 4783 | 0.4783 || 2097 | 0.2097 || 0824 | 0.0824 || 0280 | 0.0280 || 0078 | 0.0078 

1 || 3720 | 0.8503 || 3670 | 0.5767 || 2471 | 0.3294 || 1306 | 0.1586 || 0547 | 0.0625 

2 || 1240 | 0.9743 || 2753 | 0.8520 || 3177 | 0.6471 || 2613 | 0.4199 || 1641 | 0.2266 

7 3 || 0230 | 0.9973 || 1147 | 0.9667 || 2269 | 0.8740 || 2903 | 0.7102 || 2734 | 0.5000 
4 || 0026 | 0.9998 || 0287 | 0.9953 || 0972 | 0.9712 || 1935 | 0.9037 || 2734 | 0.7734 

5 || 0002 | 1.0000 |} 0043 | 0.9996 || 0250 | 0.9962 || 0774 | 0.9812 || 1641 | 0.9375 

6 || 0000 | 1.0000 |} 0004 | 1.0000 |} 0036 | 0.9998 || 0172 | 0.9984 || 0547 | 0.9922 

7 || 0000 | 1.0000 |} 0000 | 1.0000 |; 0002 | 1.0000 |} 0016 | 1.0000 || 0078 | 1.0000 

0 || 4305 | 0.4305 || 1678 | 0.1678 || 0576 | 0.0576 || 0168 | 0.0168 || 0039 | 0.0039 

1 || 3826 | 0.8131 || 3355 | 0.5033 || 1977 | 0.2553 || 0896 | 0.1064 |) 0313 | 0.0352 

2 || 1488 | 0.9619 || 2936 | 0.7969 || 2965 | 0.5518 || 2090 | 0.3154 || 1094 | 0.1445 

3 || 0331 | 0.9950 |} 1468 | 0.9437 || 2541 | 0.8059 || 2787 | 0.5941 || 2188 | 0.3633 

8 | 4 || 0046 | 0.9996 || 0459 | 0.9896 || 1361 | 0.9420 || 2322 | 0.8263 || 2734 | 0.6367 
5 || 0004 | 1.0000 |} 0092 | 0.9988 || 0467 | 0.9887 || 1239 | 0.9502 || 2188 | 0.8555 

6 || 0000 | 1.0000 |} 0011 | 0.9999 |; 0100 | 0.9987 || 0413 | 0.9915 || 1094 | 0.9648 

7 || 0000 | 1.0000 || 0001 1.0000 |} 0012 | 0.9999 || 0079 | 0.9993 || 0313 | 0.9961 

8 || 0000 | 1.0000 || 0000 | 1.0000 || 0001 1.0000 |; 0007 | 1.0000 |} 0039 | 1.0000 
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Table A6é Poisson Distribution 
Probability function f(x) [see (5), Sec. 24.7] and distribution function F(x) 


p=0.1 p= 0.2 p= 03 b= 0.4 p= 0.5 
x || FQ) F(x) FQ) F(x) FQ) F(x) FQ) F(x) FQ) F(x) 
0. 0. 0. 0. 0. 
0 9048 0.9048 8187 0.8187 7408 0.7408 6703 0.6703 6065 | 0.6065 
1 0905 0.9953 1637 0.9825 2222 0.9631 2681 0.9384 3033 | 0.9098 
2 0045 0.9998 0164 | 0.9989 0333 0.9964 0536 0.9921 0758 | 0.9856 
3 0002 1.0000 || 0011 0.9999 0033 0.9997 0072 0.9992 || 0126 | 0.9982 
4 0000 1.0000 || 0001 1.0000 || 0003 1.0000 0007 0.9999 0016 | 0.9998 
> 0001 1.0000 || 0002 1.0000 
p= 0.6 p= 0.7 p= 08 p=0.9 w= 
x || FQ) F(x) FQ) F(x) FQ) F(x) f@) F(x) FQ) F(x) 
0. 0. 0. 0. 0. 
0 5488 0.5488 4966 0.4966 4493 0.4493 4066 0.4066 3679 | 0.3679 
1 3293 0.8781 3476 0.8442 3595 0.8088 3659 0.7725 3679 | 0.7358 
2 0988 0.9769 1217 0.9659 1438 0.9526 1647 0.9371 1839 | 0.9197 
3 0198 0.9966 |} 0284 | 0.9942 0383 0.9909 0494 0.9865 0613 | 0.9810 
4 0030 0.9996 || 0050 | 0.9992 0077 0.9986 O111 0.9977 0153 | 0.9963 
5 0004 1.0000 || 0007 0.9999 0012 0.9998 0020 0.9997 0031 | 0.9994 
6 0001 1.0000 |} 0002 1.0000 0003 1.0000 || 0005 | 0.9999 
7 0001 1.0000 
f= US (i= p= 3 Ls (i= 
x || FQ) F(x) FQ) F(x) FQ) F(x) FQ) F(x) FQ) Fx) 
0. 0. 0. 0. 0. 
0 || 2231 0.2231 1353 0.1353 0498 0.0498 0183 0.0183 0067 | 0.0067 
1 || 3347 0.5578 2707 0.4060 1494 0.1991 0733 0.0916 || 0337 | 0.0404 
2 || 2510 0.8088 2707 0.6767 2240 | 0.4232 1465 0.2381 0842 | 0.1247 
3 || 1255 0.9344 1804 | 0.8571 2240 | 0.6472 1954 0.4335 1404 | 0.2650 
4 || 0471 0.9814 |} 0902 0.9473 1680 | 0.8153 1954 0.6288 1755 | 0.4405 
5 || 0141 0.9955 0361 0.9834 1008 0.9161 1563 0.7851 1755 | 0.6160 
6 || 0035 0.9991 0120 | 0.9955 0504 0.9665 1042 0.8893 1462 | 0.7622 
7 || 0008 0.9998 0034 | 0.9989 0216 0.9881 0595 0.9489 1044 | 0.8666 
8 || 0001 1.0000 || 0009 0.9998 0081 0.9962 0298 0.9786 || 0653 | 0.9319 
9 0002 1.0000 |} 0027 0.9989 0132 0.9919 0363 | 0.9682 
10 0008 0.9997 0053 0.9972 || 0181 | 0.9863 
11 0002 0.9999 0019 0.9991 0082 | 0.9945 
12 0001 1.0000 0006 0.9997 0034 | 0.9980 
13 0002 0.9999 0013 | 0.9993 
14 0001 1.0000 || 0005 | 0.9998 
15 0002 | 0.9999 
16 0000 1.0000 
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Table A7_ Normal Distribution 
Values of the distribution function ®(z) [see (3), Sec. 24.8]. ®(—z) = 1 — P(z) 
Z D(z) z PD) z D(z) z D(z) z D(z) i D(z) 
0. 0. 0. 0. 0. 0. 

0.01 5040 0.51 6950 1.01 8438 51 9345 2.01 9778 2.51 9940 
0.02 5080 0:52: 6985 1.02 8461 1:52 9357 2.02 9783 2.52 9941 
0.03 5120 0.53 7019 1.03 8485 1.53 9370 2.03 9788 2:53 9943 
0.04 5160 0.54 7054 1.04 8508 1.54 9382 2.04 9793 2.54 9945 
0.05 5199 0.55 7088 1.05 8531 1.55 9394 2.05 9798 2.55 9946 
0.06 5239 0.56 7123 1.06 8554 1.56 9406 2.06 9803 2.56 9948 
0.07 5279 0.57 7157 1.07 8577 1.57 9418 2.07 9808 257 9949 
0.08 5319 0.58 7190 1.08 8599 1.58 9429 2.08 9812 2.58 9951 
0.09 5359 0.59 7224 1.09 8621 1.59 9441 2.09 9817 2.59 9952 
0.10 5398 0.60 7257 1.10 8643 1.60 9452 2.10 9821 2.60 9953 
0.11 5438 0.61 7291 1.11 8665 1.61 9463 2.11 9826 2.61 9955 
0.12 5478 0.62 7324 1,12 8686 1.62 9474 2.12 9830 2.62 9956 
0.13 5517 0.63 7357 1.13 8708 1.63 9484 2.13 9834 2.63 9957 
0.14 5557 0.64 7389 1.14 8729 1.64 9495 2.14 9838 2.64 9959 
0.15 5596 0.65 7422 1.15 8749 1.65 9505 2135 9842 2.65 9960 
0.16 5636 0.66 7454 1.16 8770 1.66 9515 2.16 9846 2.66 9961 
0.17 5675 0.67 7486 1.17 8790 1.67 9525 217 9850 2.67 9962 
0.18 5714 0.68 7517 1.18 8810 1.68 9535 2.18 9854 2.68 9963 
0.19 5753 0.69 7549 1.19 8830 1.69 9545 2.19 9857 2.69 9964 
0.20 5793 0.70 7580 1.20 8849 1.70 9554 2.20 9861 2.70 9965 
O21 5832 0.71 7611 1.21 8869 LZ 9564 2.21 9864 2.71 9966 
0.22 5871 0.72 7642 1,22 8888 1.72 9573 2.22 9868 2.72 9967 
0.23 5910 0.73 7673 1.23 8907 1.73 9582 2.23 9871 2.73 9968 
0.24 5948 0.74 7704 1.24 8925 1.74 9591 2.24 9875 2.74 9969 
0.25 5987 0.75 7734 1,25 8944 195 9599 225 9878 2.15 9970 
0.26 6026 0.76 77164 1.26 8962 1.76 9608 2.26 9881 2.76 9971 
0.27 6064 0.77 7794 127 8980 1.77 9616 227 9884 2.77 9972 
0.28 6103 0.78 7823 1.28 8997 1.78 9625 2.28 9887 2.78 9973 
0.29 6141 0.79 7852 1.29 9015 1.79 9633 2.29 9890 2.79 9974 
0.30 6179 0.80 7881 1.30 9032 1.80 9641 2.30 9893 2.80 9974 
0.31 6217 0.81 7910 1.31 9049 1.81 9649 2.31 9896 2.81 9975 
0.32 6255 0.82 7939 1.32 9066 1.82 9656 2.32 9898 2.82 9976 
0.33 6293 0.83 7967 1.33 9082 1.83 9664 2.33 9901 2.83 9977 
0.34 6331 0.84 7995 1.34 9099 1.84 9671 2.34 9904 2.84 9977 
0.35 6368 0.85 8023 1.35 9115 1.85 9678 2.35 9906 2.85 9978 
0.36 6406 0.86 8051 1.36 9131 1.86 9686 2.36 9909 2.86 9979 
0.37 6443 0.87 8078 1.37 9147 1.87 9693 2.37 9911 2.87 9979 
0.38 6480 0.88 8106 1.38 9162 1.88 9699 2.38 9913 2.88 9980 
0.39 6517 0.89 8133 1.39 9177 1.89 9706 2.39 9916 2.89 9981 
0.40 6554 0.90 8159 1.40 9192 1.90 9713 2.40 9918 2.90 9981 
0.41 6591 0.91 8186 1.41 9207 1.91 9719 2.41 9920 2.91 9982 
0.42 6628 0.92 8212 1.42 9222 1.92 9726 2.42 9922 2.92 9982 
0.43 6664 0.93 8238 1.43 9236 1.93 9732 2.43 9925 2.93 9983 
0.44 6700 0.94 8264 1.44 9251 1.94 9738 2.44 9927 2.94 9984 
0.45 6736 0.95 8289 1.45 9265 1.95 9744 2.45 9929 2.95 9984 
0.46 6772 0.96 8315 1.46 9279 1.96 9750 2.46 9931 2.96 9985 
0.47 6808 0.97 8340 1.47 9292 1.97 9756 247 9932 2.97 9985 
0.48 6844 0.98 8365 1.48 9306 1.98 9761 2.48 9934 2.98 9986 
0.49 6879 0.99 8389 1.49 9319 1.99 9767 2.49 9936 2.99 9986 
0.50 6915 1.00 8413 1.50 9332 2.00 9772 2.50 9938 3.00 9987 
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Table A8 Normal Distribution 


Values of z for given values of ®(z) [see (3), Sec. 24.8] and D(z) = ®(z) — ®(—z) 
Example: z = 0.279 if B(z) = 61%; z = 0.860 if D(z) = 61%. 


% z(P) z(D) % 2(@) z(D) % 2(@) z(D) 
1 —2.326 0.013 41 —0.228 0.539 81 0.878 1.311 
2 —2.054 0.025 42 —0.202 0.553 82 0.915 1.341 
3 —1.881 0.038 43 —0.176 0.568 83 0.954 1.372 
4 = [751 0.050 44 —0.151 0.583 84 0.994 1.405 
5 — 1.645 0.063 45 —0.126 0.598 85 1.036 1.440 
6 —1.555 0.075 46 —0.100 0.613 86 1.080 1.476 
7 —1.476 0.088 47 —0.075 0.628 87 1.126 1.514 
8 — 1.405 0.100 48 —0.050 0.643 88 1.175 1.555 
9 —1.341 0.113 49 —0.025 0.659 89 1.227 1.598 

10 =1,282 0.126 50 0.000 0.674 90 1.282 1.645 

11 —1.227 0.138 of 0.025 0.690 91 1.341 1.695 

12 —1.175 0.151 52 0.050 0.706 92 1.405 L731 

13 —1.126 0.164 53 0.075 0.722 93 1.476 1.812 

14 — 1.080 0.176 54 0.100 0.739 94 1.555 1.881 

15 — 1.036 0.189 55 0.126 0.755 95 1.645 1.960 

16 —0.994 0.202 56 0.151 0.772 96 1.751 2.054 

17 —0.954 0.215 37 0.176 0.789 97 1.881 2.170 

18 —0.915 0.228 58 0.202 0.806 97.5 1.960 2.241 

19 —0.878 0.240 59 0.228 0.824 98 2.054 2.326 

20 —0.842 0.253 60 0.253 0.842 99 2.326 2.576 

21 —0.806 0.266 61 0.279 0.860 99.1 2.366 2.612 

22 —0.772 0.279 62 0.305 0.878 99.2 2.409 2.652 

23 —0.739 0.292 63 0.332 0.896 99.3 2.457 2.697 

24 —0.706 0.305 64 0.358 0.915 99.4 2512 2.748 

25 —0.674 0.319 65 0.385 0.935 99.5 2.576 2.807 

26 —0.643 0.332 66 0.412 0.954 99.6 2.652 2.878 

27 —0.613 0.345 67 0.440 0.974 99.7 2.748 2.968 

28 —0.583 0.358 68 0.468 0.994 99.8 2.878 3.090 

29 —0.553 0.372 69 0.496 1.015 99.9 3.090 3.291 

30 —0.524 0.385 70 0.524 1.036 

31 —0.496 0.399 71 0.553 1.058 99.91 3.121 3.320 

32 —0.468 0.412 72 0.583 1.080 99.92 3.156 3.353 

33 —0.440 0.426 73 0.613 1.103 99.93 3.195 3.390 

34 —0,412 0.440 74 0.643 1.126 99.94 3.239 3.432 

35 —0.385 0.454 75 0.674 1.150 99.95 3.291 3.481 

36 —0.358 0.468 76 0.706 1.175 99.96 3.353 3.540 

37 —0.332 0.482 77 0.739 1.200 99.97 3.432 3.615 

38 —0.305 0.496 78 0.772 1.227 99.98 3.540 3.719 

39 —0.279 0.510 79 0.806 1.254 99.99 3.719 3.891 

40 —0.253 0.524 80 0.842 1.282 
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Table AQ t-Distribution 
Values of z for given values of the distribution function F(z) (see (8) in Sec. 25.3). 
Example: For 9 degrees of freedom, z = 1.83 when F(z) = 0.95. 
Number of Degrees of Freedom 
F(z) 
1 2 3 4 Pe) 6 7 8 9 10 
0.5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 
0.6 0.32 0.29 0.28 0.27 0.27 0.26 0.26 0.26 0.26 0.26 
0.7 0.73 0.62 0.58 0.57 0.56 0.55 0.55 0.55 0.54 0.54 
0.8 1.38 1.06 0.98 0.94 0.92 0.91 0.90 0.89 0.88 0.88 
0.9 3.08 1.89 1.64 1.53 1.48 1.44 1.41 1.40 1.38 1.37 
0.95 6.31 2.92 2.35 2.13 2.02 1.94 1.89 1.86 1.83 1.81 
0.975 12.7 4.30 3.18 2.78 2.57 2.45 2.36 2:31 2.26 2.23 
0.99 31.8 6.96 4.54 3.75 3.36 3.14 3.00 2.90 2.82 2.76 
0.995 63.7 9.92 5.84 4.60 4.03 3.71 3.50 3.36 3:25 3.17 
0.999 318.3 22.3 10.2 TAT 5.89 5.21 4.79 4.50 4.30 4.14 
Number of Degrees of Freedom 
F(z) 
11 2 13 14 15 16 7 18 19 20 
0.5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 
0.6 0.26 0.26 0.26 0.26 0.26 0.26 0.26 0.26 0.26 0.26 
0.7 0.54 0.54 0.54 0.54 0.54 0.54 0.53 0.53 0.53 0.53 
0.8 0.88 0.87 0.87 0.87 0.87 0.86 0.86 0.86 0.86 0.86 
0.9 1.36 1.36 1.35 1.35 1.34 1.34 1.33 1.33 1.33 1.33 
0.95 1.80 1.78 1.77 1.76 1.75 LTS 1.74 1:73 3 1,72 
0.975 2.20 2.18 2.16 2.14 2.13 2.12 2.11 2.10 2.09 2.09 
0.99 2/2 2.68 2.65 2.62 2.60 2.58 2.57 2.55 2.54 2.93 
0.995 3.11 3.05 3.01 2.98 2.95 2:92. 2.90 2.88 2.86 2.85 
0.999 4.02 3.93 3.85 3.79 313 3.69 3.65 3.61 3.58 3.55) 
Number of Degrees of Freedom 
F(z) 
22 24 26 28 30 40 50 100 200 lo) 
0.5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 
0.6 0.26 0.26 0.26 0.26 0.26 0.26 0.25 0.25 0.25 0.25 
0.7 0.53 0.53 0.53 0.53 0.53 0.53 0.53 0.53 0.53 0.52 
0.8 0.86 0.86 0.86 0.85 0.85 0.85 0.85 0.85 0.84 0.84 
0.9 1.32 1.32 1.31 1.31 1:31 1.30 1.30 1.29 1.29 1.28 
0.95 1.72 1.71 17 1.70 1.70 1.68 1.68 1.66 1.65 1.65 
0.975 2.07 2.06 2.06 2.05 2.04 2.02 2.01 1.98 1.97 1.96 
0.99 2.51 2.49 2.48 2.47 2.46 2.42 2.40 2.36 2.35 2.33 
0.995 2.82 2.80 2.78 2.76 2.75 2.70 2.68 2.63 2.60 2.58 
0.999 3.50 3.47 3.43 3.41 3.39 3.31 3.26 3.17 3.13 3.09 
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Table Al0 Chi-square Distribution 


Values of x for given values of the distribution function F(z) (see Sec. 25.3 before (17)). 
Example: For 3 degrees of freedom, z = 11.34 when F(z) = 0.99. 


Number of Degrees of Freedom 


F(z) 

1 p 3 4 5 6 i 8 9 10 
0.005 0.00 0.01 0.07 0.21 0.41 0.68 0.99 1.34 1,73 2.16 
0.01 0.00 0.02 0.11 0.30 0.55 0.87 1.24 1.65 2.09 2.56 
0.025 0.00 0.05 0.22 0.48 0.83 1.24 1.69 2.18 2.70 3.25 
0.05 0.00 0.10 0.35 0.71 1,15 1.64 2 2.73 3.33 3.94 
0.95 3.84 5.99 7.81 9.49 | 11.07 | 12.59 | 14.07 | 15.51 16.92 | 18.31 
0.975 5.02 7.38 0.35 11.14 | 12.83 | 1445 | 16.01 | 17.53 | 19.02 | 20.48 
0.99 6.63 ol 11.34 13.28 15.09 | 16.81 18.48 | 20.09 | 21.67 | 23.21 
0.995 7.88 10.60 12.84 14.86 | 16.75 | 18.55 | 20.28 | 21.95 | 23.59 | 25.19 

Number of Degrees of Freedom 

F(z) 

11 12 13 14 15 16 7) 18 19 20 
0.005 2.60 3.07 357 4.07 4.60 5.14 5.70 6.26 6.84 7.43 
0.01 3.05 SDT 4.11 4.66 5.23 5.81 6.41 7.01 7.63 8.26 
0.025 3.82 4.40 5.01 5.63 6.26 6.91 7.56 8.23 8.91 9.59 
0.05 4.57 5.23 5.89 6.57 7.26 7.96 8.67 9.39 | 10.12 | 10.85 
0.95 19.68 21.03 22.36 23.68 | 25.00 | 26.30 | 27.59 | 28.87 | 30.14 | 31.41 


0.975 21.92 23.34 24.74 26.12 | 27.49 | 28.85 | 30.19 | 31.53 | 32.85 | 34.17 
0.99 24.72 26.22 27.69 29.14 | 30.58 | 32.00 | 33.41 34.81 | 36.19 | 37.57 
0.995 26.76 28.30 29.82 31.32 | 32.80 | 34.27 | 35.72 | 37.16 | 38.58 | 40.00 


Number of Degrees of Freedom 


F(z) 

21 mp) 23 24 25 26 Ml 28 29 30 
0.005 8.0 8.6 9.3 9.9 10.5 11.2 11.8 12.5 13.1 13.8 
0.01 8.9 9.5 10.2 10.9 11.5 12.2 12.9 13.6 14.3 15.0 
0.025 10.3 11.0 11.7 12.4 13.1 13.8 14.6 15.3 16.0 16.8 
0.05 11.6 12.3 13.1 13.8 14.6 15.4 16.2 16.9 17.7 18.5 
0.95 32.7 33.9 35.2 36.4 37.7 38.9 40.1 413 | 426 | 43.8 
0.975 35.5 36.8 38.1 39.4 40.6 41.9 43.2 44.5 | 45.7 | 47.0 
0.99 38.9 40.3 41.6 43.0 44.3 45.6 47.0 483 | 496 | 50.9 
0.995 41.4 42.8 44.2 45.6 46.9 48.3 49.6 51.0 | 52.3 | 53.7 

Number of Degrees of Freedom 

F(z) 

40 50 60 70 80 90 100 > 100 (Approximation) 
0.005 | 20.7 28.0 35.5 43.3 51.2 59.2 67.3 3(h — 2.58)? 
0.01 22.2 29.7 37.5 45.4 53.5 61.8 70.1 3(h — 2.33)" 
0.025 | 24.4 32.4 40.5 48.8 57.2 65.6 74.2 3(h — 1.96)" 
0.05 26.5 34.8 43.2 51.7 60.4 69.1 71.9 3(h — 1.64)? 
0.95 55.8 67.5 79.1 90.5 101.9 113.1 124.3 3(h + 1.64)? 
0.975 | 59.3 71.4 83.3 95.0 106.6 118.1 129.6 3(h + 1.96)? 
0.99 63.7 76.2 88.4 100.4 112.3 124.1 135.8 3(h + 2.33)" 
0.995 | 66.8 79.5 92.0 104.2 116.3 128.3 140.2 3(h + 2.58)" 


In the last column, h = V2m — 1, where m is the number of degrees of freedom. 
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Table All F-Distribution with (m, n) Degrees of Freedom 


Values of z for which the distribution function F(z) [see (13), Sec. 25.4] has the value 0.95 
Example: For (7, 4) d.f., z = 6.09 if F(z) = 0.95. 


n m=1 m=2 m = 3 m=4 m=5 m= m= m= 8 m=9 
1 161 200 216 225 230 234 237 239 241 
2 18.5 19.0 19.2 19.2 19.3 19.3 19.4 19.4 19.4 

3 10.1 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 
4 771 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 
5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 
6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 
7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 

8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 
9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 
10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 
11 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 
12 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80 
13 4.67 3.81 3.41 3.18 3.03 2.92 2.83 DTT 2.71 
14 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65 
15 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59 
16 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54 
17 4.45 3.59 3.20 2.96 2.81 2.70 2.61 295 2.49 
18 4.41 355 3.16 2.93 2.77 2.66 2.58 2:51 2.46 
19 4.38 3.52 3.13 2.90 2.74 2.63 2.54 2.48 2.42 
20 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39 
22 4.30 3.44 3.05 2.82 2.66 2.55 2.46 2.40 2.34 
24 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30 
26 4.23 3.37 2.98 2.74 2.59 2.47 2.39 2:32, 2.27 
28 4.20 3.34 2.95 2.71 2.56 2.45 2.36 2.29 2.24 
30 4.17 3,32 2.92 2.69 2.53 2.42 2.33 2.27 2.21 
32 4.15 3.29 2.90 2.67 251 2.40 2.31 2.24 2.19 
34 4.13 3.28 2.88 2.65 2.49 2.38 2.29 2:23 217 
36 4.11 3.26 2.87 2.63 2.48 2.36 2.28 2.21 2.15 
38 4.10 3.24 2.85 2.62 2.46 2.35 2.26 2.19 2.14 
40 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12 
50 4.03 3.18 2.79 2.56 2.40 2.29 2.20 2.13 2.07 
60 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04 
70 3.98 3.13 2.74 2.50 2.35 2.23 2.14 2.07 2.02 
80 3.96 3.11 2.72 2.49 2.33 2.21 2.13 2.06 2.00 
90 3.95 3.10 2.71 2.47 2.32 2.20 2.11 2.04 1.99 
100 3.94 3.09 2.70 2.46 2.31 2.19 2.10 2.03 1.97 
150 3.90 3.06 2.66 2.43 2.27 2.16 2.07 2.00 1.94 
200 3.89 3.04 2.65 2.42 2.26 2.14 2.06 1.98 1.93 
1000 3.85 3.00 2.61 2.38 2.22 2.11 2.02 1.95 1.89 
oo 3.84 3.00 2.60 2.37 2,21 2.10 2.01 1.94 1.88 
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Table All F-Distribution with (m, n) Degrees of Freedom (continued) 


Values of z for which the distribution function F(z) [see (13), Sec. 25.4] has the value 0.95 


n m = 10 m= 15 m = 20 m = 30 m = 40 m = 50 m = 100 ie) 
1 242 246 248 250 251 252 253 254 
2, 19.4 19.4 19.4 19.5 19.5 19.5 19.5 19.5 

3 8.79 8.70 8.66 8.62 8.59 8.58 8.55 8.53 
4 5.96 5.86 5.80 i) 5.72 5.70 5.66 5.63 
5 4.74 4.62 4.56 4.50 4.46 4.44 441 4.37 
6 4.06 3.94 3.87 3.81 ST 3.75 3.71 3.67 
7 3.64 3.51 3.44 3.38 3.34 3.32 3.27 3.23 
8 3.35 3.22 3.15 3.08 3.04 3.02 2.97 2.93 
9 3.14 3.01 2.94 2.86 2.83 2.80 2.76 2:71 
10 2.98 2.85 2.77 2.70 2.66 2.64 2.59 2.54 
11 2.85 212 2.65 2.57 2.53 2.51 2.46 2.40 
12 2.75 2.62 2.54 2.47 2.43 2.40 2.35 2.30 
13 2.67 253 2.46 2.38 2.34 2.31 2.26 2.21 
14 2.60 2.46 2.39 2.31 2.27 2.24 2.19 2.13 
15 2.54 2.40 2.33 2.25 2.20 2.18 2,12 2.07 
16 2.49 2.35 2.28 2.19 2.15 2:12 2.07 2.01 
17 2.45 2.31 2.23 2.15 2.10 2.08 2.02 1.96 
18 2.41 227 2.19 2.11 2.06 2.04 1.98 1.92 
19 2.38 2.23 2.16 2.07 2.03 2.00 1.94 1.88 
20 2.35 2.20 212 2.04 1.99 1.97 1.91 1.84 
22 2.30 2315 2.07 1.98 1.94 1.91 1.85 1.78 
24 225 2.11 2.03 1.94 1.89 1.86 1.80 173: 
26 2.22 2.07 1.99 1.90 1.85 1.82 1.76 1.69 
28 2.19 2.04 1.96 1.87 1.82 1.79 1.73 1.65 
30 2.16 2.01 1.93 1.84 1.79 1.76 1.70 1.62 
32 2.14 1.99 1.91 1.82 Lay 1.74 1.67 1.59 
34 2.12 1.97 1.89 1.80 175. 1:71 1.65 1.57 
36 2.14 1.95 1.87 1.78 1.73 1.69 1.62 155 
38 2.09 1.94 1.85 1.76 1.71 1.68 1.61 153 
40 2.08 1.92 1.84 1.74 1.69 1.66 1.59 1.51 
50 2.03 1.87 1.78 1.69 1.63 1.60 1.52 1.44 
60 1.99 1.84 1.75 1.65 1.59 1.56 1.48 1.39 
70 1.97 1.81 1.72 1.62 1.57 1.53 1.45 1.35 
80 1.95 1.79 1.70 1.60 1.54 151 1.43 1.32 
90 1.94 1.78 1.69 1.59 1.53 1.49 1.41 1.30 
100 1.93 Le 1.68 1.57 1.52 1.48 1.39 1.28 
150 1.89 1:73 1.64 1.54 1.48 1.44 1.34 1.22 
200 1.88 1.72 1.62 1.52 1.46 1.41 1.32 1.19 
1000 1.84 1.68 1.58 1.47 1.41 1.36 1.26 1.08 
oo 1.83 1.67 1.57 1.46 1.39 1:35 1.24 1.00 
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Table All F-Distribution with (m, n) Degrees of Freedom (continued) 
Values of z for which the distribution function F(z) [see (13), Sec. 25.4] has the value 0.99 
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n m=1 m=2 m = 3 m=4 m=5 m=6 m=7 m= 8 m=9 
1 | 4052 4999 5403 5625 5764 5859 5928 5981 6022 
2. 98.5 99.0 99.2 99.2 99.3 99.3 99.4 99.4 99.4 
3 34.1 30.8 29.5 28.7 28.2 27.9 27.7 27.5 213 
4 21.2 18.0 16.7 16.0 15.5 15.2 15.0 14.8 14.7 
5 16.3 13.3 12.1 11.4 11.0 10.7 10.5 10.3 10.2 
6 137 10.9 9.78 9.15 8.75 8.47 8.26 8.10 7.98 
7 12.2 9.55 8.45 7.85 7.46 7.19 6.99 6.84 6.72 
8 11.3 8.65 7.59 7.01 6.63 6.37 6.18 6.03 5.91 
9 10.6 8.02 6.99 6.42 6.06 5.80 5.61 5.47 3.35 
10 10.0 7.56 6.55 5.99 5.64 5.39 5.20 5.06 4.94 
11 9.65 7.21 6.22 5.67 5.32 5.07 4.89 4.74 4.63 
12. 9.33 6.93 5.95 5.41 5.06 4.82 4.64 4.50 4.39 
13 9.07 6.70 5.74 5.21 4.86 4.62 4.44 4.30 4.19 
14 8.86 6.51 5.56 5.04 4.69 4.46 4.28 4.14 4.03 
15 8.68 6.36 5.42 4.89 4.56 4.32 4.14 4.00 3.89 
16 8.53 6.23 5.29 4.77 4.44 4.20 4.03 3.89 3.78 
17 8.40 6.11 5.18 4.67 4.34 4.10 3.93 3.79 3.68 
18 8.29 6.01 5.09 4.58 4.25 4.01 3.84 3.71 3.60 
19 8.18 5.93 5.01 4.50 4.17 3.94 3.77 3.63 3.52 
20 8.10 5.85 4.94 4.43 4.10 3.87 3.70 3.56 3.46 
22 7.95 5.72 4.82 4.31 3.99 3.76 3.59 3.45 3.35 
24 7.82 5.61 4.72 4.22 3.90 3.67 3.50 3.36 3.26 
26 V2 5.53 4.64 4.14 3.82 3.59 3.42 3.29 3.18 
28 7.64 5.45 4.57 4.07 3.75 3.53 3.36 3:23 3.12 
30 7.56 5.39 4.51 4.02 3.70 3.47 3.30 3.17 3.07 
32 7.50 5.34 4.46 3.97 3.65 3.43 3.26 3.13 3.02 
34 TA4 5.29 4.42 3.93 3.61 3.39 3.22 3.09 2.98 
36 740 5.25 4.38 3.89 3.57 3.35 3.18 3.05 2.95 
38 7.35 5.21 4.34 3.86 3.54 3.32 3.15 3.02 2.92 
40 731 5.18 4.31 3.83 31 3.29 3.12 2.99 2.89 
50 LAT 5.06 4.20 3.72 3.41 3.19 3.02 2.89 2.78 
60 7.08 4.98 4.13 3.65 3.34 3.12 2.95 2.82 2:72 
70 7.01 4.92 4.07 3.60 3.29 3.07 2.91 2.78 2.67 
80 6.96 4.88 4.04 3.56 3.26 3.04 2.87 2.74 2.64 
90 6.93 4.85 4.01 3.54 3.23 3.01 2.84 2.12 2.61 
100 6.90 4.82 3.98 3.51 3.21 2.99 2.82 2.69 2.59 
150 6.81 4.75 3.91 3.45 3.14 2.92 2.76 2.63 253 
200 6.76 4.71 3.88 3.41 3.11 2.89 2.73 2.60 2.50 
1000 6.66 4.63 3.80 3.34 3.04 2.82 2.66 2.53 2.43 
oo 6.63 4.61 3.78 3332 3.02 2.80 2.64 251 2.41 
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Table All F-Distribution with (m, n) Degrees of Freedom (continued) 


Values of z for which the distribution function F(z) [see (13), Sec. 25.4] has the value 0.99 


n m = 10 m= 15 m = 20 m = 30 m = 40 m = 50 m = 100 fe) 
1} 6056 6157 6209 6261 6287 6303 6334 6366 
2 99.4 99.4 99.4 99.5 99.5 99.5 99.5 99.5 
3 21.2 26.9 26.7 26.5 26.4 26.4 26.2 26.1 
4 14.5 14.2 14.0 13.8 13.7 13.7 13.6 13.5 
5 10.1 9.72 9.55 9.38 9.29 9.24 9.13 9.02 
6 7.87 7.56 7.40 423 7.14 7.09 6.99 6.88 
yi 6.62 6.31 6.16 5.99 5.91 5.86 5.75 5.65 
8 5.81 5.52 5.36 5.20 5.12 5.07 4.96 4.86 
9 5.26 4.96 4.81 4.65 4.57 4.52 4.42 4.31 
10 4.85 4.56 4.41 4.25 4.17 4.12 4.01 3.91 
11 4.54 4.25 4.10 3.94 3.86 3.81 3.71 3.60 
12 4.30 4.01 3.86 3.70 3.62 3.57 3.47 3.36 
13 4.10 3.82 3.66 3.51 3.43 3.38 3.27 3.17 
14 3.94 3.66 31 3.35 3.27 3.22 3.11 3.00 
15 3.80 3.52 3.37 3.21 3.13 3.08 2.98 2.87 
16 3.69 3.41 3.26 3.10 3.02 2.97 2.86 2.75 
17 3.59 3:31 3.16 3.00 2.92 2.87 2.76 2.65 
18 3.51 3.23 3.08 2.92 2.84 2.78 2.68 2.57 
19 3.43 3.15: 3.00 2.84 2.76 2.71 2.60 2.49 
20 3337 3.09 2.94 2.78 2.69 2.64 2.54 2.42 
22 3.26 2.98 2.83 2.67 2.58 2.53 2.42 2.31 
24 3.17 2.89 2.74 2.58 2.49 2.44 2.33 2.21 
26 3.09 2.81 2.66 2.50 2.42 2.36 2.25 2.13 
28 3.03 215 2.60 2.44 2.35 2.30 2.19 2.06 
30 2.98 2.70 2.55 2.39 2.30 2.25 2:13 2.01 
32 2.93 2.65 2.50 2.34 2.25 2.20 2.08 1.96 
34 2.89 2.61 2.46 2.30 2.21 2.16 2.04 1.91 
36 2.86 2.58 2.43 2:26 2.18 2.12 2.00 1.87 
38 2.83 2.55 2.40 2.23 2.14 2.09 1.97 1.84 
40 2.80 2.52 2.37 2.20 2.11 2.06 1.94 1.80 
50 2.70 2.42 227 2.10 2.01 1.95 1.82 1.68 
60 2.63 2.35 2.20 2.03 1.94 1.88 1.75 1.60 
70 2.59 2.31 2.15 1.98 1.89 1.83 1.70 1.54 
80 2.95 2.27 2.12 1.94 1.85 1.79 1.65 1.49 
90 2352 2.24 2.09 1.92 1.82 1.76 1.62 1.46 
100 2.50 2.22 2.07 1.89 1.80 1.74 1.60 1.43 
150 2.44 2.16 2.00 1.83 1.73 1.66 1.52 1.33 
200 2.41 2.13 1.97 1.79 1.69 1.63 1.48 1.28 
1000 2.34 2.06 1.90 12 1.61 1.54 1.38 1.11 
oo 2732; 2.04 1.88 1.70 1.59 1.52 1.36 1.00 
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Table Al2 Distribution Function F(x) = P(T S x) of the Random Variable T in 
Section 25.8 


n n n n n n n 
x | =3 x|/= x | =5 x|/= x | =7 x | =8 3 || =o) x |=10 x |=11 
0. 0 0. 0. 0. 0. 0. 0. 0. 
167} | 0/042] | 0/008] | 0 | 001 1/001 2/ 001 4| 001 6 | 001 8 | 001 
1/500} | 1/167] | 1) 042] | 1 | 008 2 | 005 3 | 003 5| 003 7| 002 9 | 002 
2|375| |2]|117) | 2 | 028 3/015 4] 007 6 | 006 8| 005} | 10| 003 
Fi 3 | 242) | 3 | 068 4} 035 5| 016 7/012 9/008 | | 11| 005 
Paar 4/408} | 4 | 136 5 | 068 6| 031 8/022) |10|/014| | 12] 008 
r 5 | 235 6| 119 7| 054 9} 038) | 11/023} | 13/013 
0. eae 6 | 360 7/191 8|089| | 10/060} | 12/036) | 14] 020 
50/001 7 | 500 8 | 281 9) 138} |11}090} | 13|054} | 15| 030 
51 | 002 0. 9/386) | 10/199} | 12) 130] |14]078| | 16| 043 
52 002 43/001 & 10 | 500 11] 274 13 | 179 15 | 108 17| 060 
53 | 003 44 | 002 =e 12 | 360 14} 238 16 | 146 18 | 082 
54 004 45 | 002 13 | 452 15 | 306 17 | 190 19| 109 
55 | 005 46 | 003 0. z 16 | 381 18 | 242 20) 141 
56 | 006 47 | 003 38 | 001 Polena 17 | 460 19 | 300 21| 179 
57} 007} |48|004| | 39/002 20 | 364 | | 22) 223 
58 | 008 | | 49/005) | 40) 003 0. 21) 431} | 23) 271 
59]010| |50]006| |41/003} /32]001] | . |=46 a2) 500) | 24).324 
60/012| |51|008| |42|004} |33| 002 25 | 381 
61/014) |52]010| |43|005} | 34] 002 0. ji 26 | 440 
62|017| |53}012| |44/007| |35}003| |27/ 001} | , |~45 27) 500 
63/020) |54/014| | 45/009] | 36/004) | 28] 002 
64/023} |55|017| |46/011| |37/005| | 29} 002 0. Ba a 
65|027| |56|021| |47/ 013} |38|007| |30| 003) | 23/001 
66 | 032| |57|025| |48/016| |39/009} |31| 004} | 24 | 002 0. 
67/037 | |58|029| |49]020} |40/011) |32]006| |25]003| | 18] 001] | , |=43 
68 | 043 | | 59/034) |50)024| |41/014] |33]}008| | 26/004] | 19] 002 
69|049| | 60/040} |51)029| |42/017] |34]}010| |27|006} | 20] 002 0 
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Abel, Niels Henrik, 79n.6 
Abel’s formula, 79 
Absolute convergence (series): 
defined, 674 
and uniform convergence, 704 
Absolute frequency (probability): 
of an event, 1019 
cumulative, 1012 
of a value, 1012 
Absolutely integrable nonperiodic 
function, 512—513 
Absolute value (complex numbers), 
613 
Acceleration, 386-389 
Acceleration of gravity, 8 
Acceleration vector, 386 
Acceptable lots, 1094 
Acceptable quality level (AQL), 
1094 
Acceptance: 
of a hypothesis, 1078 
of products, 1092 
Acceptance number, 1092 
Acceptance sampling, 1092-1096, 
1113 
errors in, 1093-1094 
rectification, 1094-1095 
Adams, John Couch, 912n.2 
Adams-Bashforth methods, 911-914, 
947 
Adams—Moulton methods, 913-914, 
947 
Adaptive integration, 835-836, 843 
Addition: 
for arbitrary events, 1021-1022 
of complex numbers, 609, 610 
of matrices and vectors, 126, 
259-261 
of means, 1057-1058 
for mutually exclusive events, 
1021 
of power series, 687 
termwise, 173, 687 
of variances, 1058-1059 
vector, 309, 357-359 
ADI (alternating direction implicit) 
method, 928-930 
Adjacency matrix: 
of a digraph, 973 
of a graph, 972-973 


Adjacent vertices, 971, 977 
Airy, Sir George Bidell, 556n.2, 
918n.4 
Airy equation, 556 
RK method, 917-919 
RKN method, 919-920 
Airy function: 
RK method, 917-919 
RKN method, 919-920 
Algebraic equations, 798 
Algebraic multiplicity, 326, 
878 
Algorithms: 
complexity of, 978-979 
defined, 796 
numeric analysis, 796 
numeric methods as, 788 
numeric stability of, 796, 842 
ALGORITHMS: 
BISECT, A46 
DIJKSTRA, 982 
EULER, 903 
FORD-FULKERSON, 998 
GAUSS, 849 
GAUSS-SEIDEL, 860 
INTERPOL, 814 
KRUSKAL, 985 
MATCHING, 1003 
MOORE, 977 
NEWTON, 802 
PRIM, 989 
RUNGE-KUTTA, 905 
SIMPSON, 832 
Aliasing, 531 
Alternating direction implicit (ADI 
method, 928-930 
Alternating path, 1002 
Alternative hypothesis, 1078 
Ampére, André Marie, 93n.7 
Amplification, 91 
Amplitude, 90 
Amplitude spectrum, 511 
Analytic functions, 172, 201, 641 
complex analysis, 623-624 
conformal mapping, 737-742 
derivatives of, 664-668, 688-689, 
A95-A96 
integration of: 
indefinite, 647 
by use of path, 647-650 


Analytic functions (Cont.) 
Laurent series: 
analytics at infinity, 718-719 
zeros of, 717-718 
maximum modulus theorem, 
782-783 
mean value property, 781-782 
power series representation of, 
688-689 
real functions vs., 694 
Analyticity, 623 
Angle of intersection: 
conformal mapping, 738 
between two curves, 36 
Angular speed (rotation), 372 
Angular velocity (fluid flow), 
7715 
AOQ (average outgoing quality), 
1095 
AOQL (average outgoing quality 
limit), 1095 
Apparent resistance (RLC circuits), 
95 
Approximation(s): 
errors involved in, 794 
polynomial, 808 
by trigonometric polynomials, 
495-498 
Approximation theory, 495 
A priori estimates, 805 
AQL (acceptable quality level), 1094 
Arbitrary positive, 191 
Arc, of a curve, 383 
Archimedes, 391n.4 
Arc length (curves), 385-386 
Area: 
of a region, 428 
of region bounded by ellipses, 
436 
of a surface, 448-450 
Argand, Jean Robert, 611n.2 
Argand diagram, 611n.2 
Argument (complex numbers), 613 
Artificial variables, 965-968 
Assignment problems (combinatorial 
optimization), 1001-1006 
Associative law, 264 
Asymptotically equal, 189, 1027, 
1050 
Asymptotically normal, 1076 
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Asymptotically stable critical points, 
149 

Augmented matrices, 258, 272, 273, 
321, 845, 959 

Augmenting path, 1002-1003. See 
also Flow augmenting paths 

Autonomous ODEs, 11, 33 

Autonomous systems, 152, 165 

Auxiliary equation, 54. See also 
Characteristic equation 

Average flow, 458 

Average outgoing quality (AOQ), 
1095 

Average outgoing quality limit 
(AOQL), 1095 

Axioms of probability, 1020 


Back substitution (linear systems), 
274-276, 846 
Backward edges: 
cut sets, 994 
initial flow, 998 
of a path, 992 
Backward Euler formula, 909 
Backward Euler method (BEM): 
first-order ODEs, 909-910 
stiff systems, 920-921 
Backward Euler scheme, 909 
Balance law, 14 
Band matrices, 928 
Bashforth, Francis, 912n.2 
Basic feasible solution: 
normal form of linear optimization 
problems, 957 
simplex method, 959 
Basic Rule (method of undetermined 
coefficients): 
higher-order homogeneous linear 
ODEs, 115 
second-order nonhomogeneous 
linear ODEs, 81, 82 
Basic variables, 960 
Basis: 
eigenvectors, 339-340 
of solutions: 
higher-order linear ODEs, 106, 
113, 123 
homogeneous linear systems, 
290 
homogeneous ODEs, 50-52, 
75, 104, 106, 113 
second-order homogeneous 
linear ODEs, 50-52, 75, 
104 
systems of ODEs, 139 
standard, 314 
vector spaces, 286, 311, 314 
Beats (oscillation), 89 


Bellman, Richard, 981n.3 
Bellman equations, 981 
Bellman’s principle, 980-981 
Bell-shaped curve, 13, 574 
BEM, see Backward Euler method 
Benoulli, Niklaus, 31n.7 
Bernoulli, Daniel, 31n.7 
Bernoulli, Jakob, 31n.7 
Bernoulli, Johann, 31n.7 
Bernoulli distribution, 1040. See also 
Binomial distributions 
Bernoulli equation, 45 
defined, 31 
linear ODEs, 31-33 
Bernoulli’s law of large numbers, 
1051 
Bessel, Friedrich Wilhelm, 187n.6 
Bessel functions, 167, 187-191, 202 
of the first kind, 189-190 
with half-integer v, 193-194 
of order 1, 189 
of order v, 191 
orthogonality of, 506 
of the second kind: 
general solution, 196-200 
of order v, 198-200 
table, A97—A98 
of the third kind, 200 
Bessel’s equation, 167, 187-196, 
202 
Bessel functions, 167, 187-191, 
196-200 
circular membrane, 587 
general solution, 194-200 
Bessel’s inequality: 
for Fourier coefficients, 497 
orthogonal series, 508-509 
Beta function, formula for, A67 
Bezier curve, 827 
BFS algorithms, see Breadth First 
search algorithms 
Bijective mapping, 737n.1 
Binomial coefficients: 
Newton’s forward difference 
formula, 816 
probability theory, 1027-1028 
Binomial distributions, 1039-1041, 
1061 
normal approximation of, 
1049-1050 
sampling with replacement for, 
1042 
table, A99 
Binomial series, 696 
Binomial theorem, 1029 
Bipartite graphs, 1001-1006, 1008 
BISECT, ALGORITHM, A46 
Bisection method, 807-808 
Bolzano, Bernard, A94n.3 


Bolzano—Weierstrass theorem, 
A94—-A95 
Bonnet, Ossian, 180n.3 
Bonnet’s recursion, 180 
Borda, J. C., 16n.4 
Boundaries: 
ODEs, 39 
of regions, 426n.2 
sets in complex plane, 620 
Boundary conditions: 
one-dimensional heat equation, 
559 
PDEs, 541, 605 
periodic, 501 
two-dimensional wave equation, 
S77 
vibrating string, 545-547 
Boundary points, 426n.2 
Boundary value problem (BVP), 499 
conformal mapping for, 763-767, 
A96 
first, see Dirichlet problem 
mixed, see Mixed boundary value 
problem 
second, see Neumann problem 
third, see Mixed boundary value 
problem 
two-dimensional heat equation, 
564 
Bounded domains, 652 
Bounded regions, 426n.2 
Bounded sequence, A93—A95 
Boxplots, 1013 
Boyle, Robert, 19n.5 
Boyle—Mariotte’s law for idea gases, 
19 
Bragg, Sir William Henry, 938n.5 
Bragg, Sir William Lawrence, 938n.5 
Branch, of logarithm, 639 
Branch cut, of logarithm, 639 
Branch point (Riemann surfaces), 755 
Breadth First search (BFS) 
algorithms, 977 
defined, 977, 998 
Moore’s, 977-980 
BVP, see Boundary value problem 


CAD (computer-aided design), 820 
Cancellation laws, 306-307 
Canonical form, 344 
Cantor, Georg, A72n.3 
Cantor—Dedekind axiom, A72n.3, 
A95n.4 

Capacity: 

cut sets, 994 

networks, 991 
Cardano, Girolamo, 608n. 1 
Cardioid, 391, 437 
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Cartesian coordinates: 
linear element in, A75 
transformation law, A86—A87 
vector product in, A83—A84 
writing, A74 
Cartesian coordinate systems: 
complex plane, 611 
left-handed, 369, 370, A84 
right-handed, 368-369, A83—A84 
in space, 315, 356 
transformation law for vector 
components, A85—A86 
Cartesius, Renatus, 356n.1 
Cauchy, Augustin-Louis, 71n.4, 
625n.4, 683n.1 
Cauchy determinant, 113 
Cauchy—Goursat theorem, see 
Cauchy’s integral theorem 
Cauchy—Hadamard formula, 683 
Cauchy principal value, 727, 730 
Cauchy—Riemann equations, 38, 642 
complex analysis, 623-629 
proof of, A90—-A91 
Cauchy—Schwarz inequality, 363, 
871-782 
Cauchy’s convergence principle, 
674-675, A93—-A94 
Cauchy’s inequality, 666 
Cauchy’s integral formula, 660-663, 
670 
Cauchy’s integral theorem, 652-660, 
669 
existence of indefinite integral, 
656-658 
Goursat’s proof of, A91—A93 
independence of path, 655 
for multiply connected domains, 
658-659 
principle of deformation of path, 
656 
Cayley, Arthur, 748n.2 
c-charts, 1092 
Center: 
as critical point, 144, 165 
of a graph, 991 
of power series, 680 
Center control line (CL), 1088 
Center of gravity, of mass in a 
region, 429 
Central difference notation, 819 
Central limit theorem, 1076 
Central vertex, 991 
Centrifugal force, 388 
Centripetal acceleration, 387-388 
Chain rules, 392-394 
Characteristics, 555 
Characteristics, method of, 555 
Characteristic determinant, of a 
matrix, 129, 325, 326, 353, 877 


Characteristic equation: 
matrices, 129, 325, 326, 353, 877 
PDEs, 555 
second-order homogeneous linear 
ODEs, 54 
Characteristic matrix, 326 
Characteristic polynomial, 325, 353, 
877 
Characteristic values, 87, 324, 353. 
See also Eigenvalues 
Characteristic vectors, 324, 877. See 
also eigenvectors 
Chebyshev, Pafnuti, 504n.6 
Chebyshev equation, 504 
Chebyshev polynomials, 504 
Checkerboard pattern (determinants), 
294 
Chi-square (x) distribution, 
1074-1076, A104 
Chi-square (x?) test, 1096-1097, 
1113 
Choice of numeric method, for matrix 
eigenvalue problems, 879 
Cholesky, André-Louis, 855n.3 
Cholesky’s method, 855-856, 898 
Chopping, error caused by, 792 
Chromatic number, 1006 
Circle, 386 
Circle of convergence (power series), 
682 
Circulation, of flow, 467, 774 
CL (center control line), 1088 
Clairaut equation, 35 
Clamped condition (spline 
interpolation), 823 
Class intervals, 1012 
Class marks, 1012 
Closed annulus, 619 
Closed circular disk, 619 
Closed integration formulas, 833, 838 
Closed intervals, A72n.3 
Closed Newton—Cotes formulas, 833 
Closed paths, 414, 645, 975-976 
Closed regions, 426n.2 
Closed sets, 620 
Closed trails, 975-976 
Closed walks, 975-976 
CN (Crank—Nicolson) method, 
938-941 
Coefficients: 
binomial: 
Newton’s forward difference 
formula, 816 
probability theory, 1027-1028 
constant: 
higher-order homogeneous 
linear ODEs, 111-116 
second-order homogeneous 
linear ODEs, 53-60 


Coefficients: (Cont. ) 
second-order nonhomogeneous 
linear ODEs, 81 
systems of ODEs, 140-151 
correlation, 1108-1111, 1113 
Fourier, 476, 484, 538, 582-583 
of kinetic friction, 19 
of linear systems, 272, 845 
of ODEs, 47 
higher-order homogeneous 
linear ODEs, 105 
second-order homogeneous 
linear ODEs, 53-60, 73 
second-order nonhomogeneous 
linear ODEs, 81-85 
series of ODEs, 168, 174 
variable, 167, 240-241 
of power series, 680 
regression, 1105, 1107-1108 
variable: 
Frobenius method, 180-187 
Laplace transforms ODEs 
with, 240-241 
of ODEs, 167, 240-241 
power series method, 167-175 
second-order homogeneous 
linear ODEs, 73 
Coefficient matrices, 257, 273 
Hermitian or skew-Hermitian 
forms, 351 
linear systems, 845 
quadratic form, 343 
Cofactor (determinants), 294 
Collatz, Lothar, 883n.9 
Collatz inclusion theorem, 883-884 
Columns: 
determinants, 294 
matrix, 125, 257, 320 
Column “sum” norm, 861 
Column vectors, 126 
matrices, 257, 284-285, 320 
rank in terms of, 284-285 
Combinations (probability theory), 
1024, 1026-1027 
of n things taken k at a time 
without repetitions, 1026 
of n things taken k at a time with 
repetitions, 1026 
Combinatorial optimization, 970, 
975-1008 
assignment problems, 1001—1006 
flow problems in networks, 
991-997 
cut sets, 994-996 
flow augmenting paths, 
992-993 
paths, 992 
Ford—Fulkerson algorithm for 
maximum flow, 998-1001 
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Combinatorial optimization (Cont.) 
shortest path problems, 975-980 


Bellman’s principle, 980-981 

complexity of algorithms, 
978-980 

Dijkstra’s algorithm, 981-983 

Moore’s BFS algorithm, 
977-980 


shortest spanning trees: 


Greedy algorithm, 984-988 
Prim’s algorithm, 988-991 


Complex analysis (Cont.) 
Riemann surfaces, 754-756 
by trigonometric and 

hyperbolic analytic 
functions, 750-754 
half-planes, 619-620 
harmonic functions, 628-629 
Laplace’s equation, 628-629 
Laurent series, 708-719, 734 
analytic or singular at infinity, 
718-719 


Complex integration (Cont.) 
derivatives of analytic functions, 
664-668 
Laurent series, 708-719 
analytic or singular at infinity, 
718-719 
point at infinity, 718 
Riemann sphere, 718 
singularities, 715-717 
zeros of analytic functions, 
717-718 


Commutation (matrices), 271 
Complements: 

of events, 1016 

of sets in complex plane, 620 
Complementation rule, 

1020-1021 

Complete bipartite graphs, 1005 
Complete graphs, 974 
Complete matching, 1002 


Completeness (orthogonal series), 


508-509 
Complete orthonormal set, 508 
Complex analysis, 607 
analytic functions, 623-624 
Cauchy—Riemann equations, 
623-629 
circles and disks, 619 
complex functions, 620-623 


point at infinity, 718 

Riemann sphere, 718 
singularities, 715-717 

zeros of analytic functions, 717 


power series, 168, 671-707 


convergence behavior of, 
680-682 

convergence tests, 674-676, 
A93-A94 

functions given by, 685-690 

Maclaurin series, 690 

in powers of x, 168 

radius of convergence, 
682-684 

ratio test, 676-678 

root test, 678-679 

sequences, 671-673 

series, 673-674 


line integrals, 643-652, 669 


basic properties of, 645 

bounds for, 650-651 

definition of, 643-645 

existence of, 646 

indefinite integration and 
substitution of limits, 
646-647 

representation of a path, 
647-650 


power series, 671-707 


convergence behavior of, 
680-682 

convergence tests, 674-676 

functions given by, 685-690 

Maclaurin series, 690 

radius of convergence of, 
682-684 


exponential, 630-633 
general powers, 639-640 
hyperbolic, 635 
logarithm, 636-639 
trigonometric, 633-635 


complex integration, 643-670 


Cauchy’s integral formula, 
660-663, 670 
Cauchy’s integral theorem, 
652-660, 669 
derivatives of analytic 
functions, 664—668 
Laurent series, 708-719 
line integrals, 643-652, 669 
power series, 671-707 
residue integration, 719-733 


complex numbers, 608-619 


addition of, 609, 610 
conjugate, 612 

defined, 608 

division of, 610 
multiplication of, 609, 610 
polar form of, 613-618 
subtraction of, 610 


complex plane, 611 
conformal mapping, 736-757 


geometry of analytic functions, 
737-7142 

linear fractional 
transformations, 
742-750 


Taylor series, 690-697 
uniform convergence, 
698-705 
residue integration, 719-733 

formulas for residues, 721—722 
of real integrals, 725-733 
several singularities inside 
contour, 723-725 
Taylor series, 690-697, 707 
Complex conjugate numbers, 612 
Complex conjugate roots, 72-73 
Complex Fourier integral, 523 
Complex functions, 620-623 
exponential, 630-633 
general powers, 639-640 
hyperbolic, 635 
logarithm, 636-639 
trigonometric, 633-635 
Complex heat potential, 767 
Complex integration, 643-670 
Cauchy’s integral formula, 
660-663, 670 
Cauchy’s integral theorem, 
652-660, 669 
existence of indefinite integral, 
656-658 
independence of path, 655 
for multiply connected 
domains, 658-659 
principle of deformation of 
path, 656 


ratio test, 676-678 
root test, 678-679 
sequences, 671-673 
series, 673-674 
Taylor series, 690-697 
uniform convergence, 
698-705 
residue integration, 719-733 
formulas for residues, 721-722 
of real integrals, 725-733 
several singularities inside 
contour, 723-725 
Complexity, of algorithms, 978-979 
Complex line integrals, see Line 
integrals 
Complex matrices and forms, 
346-352 
Complex numbers, 608-619, 641 
addition of, 609, 610 
conjugate, 612 
defined, 608 
division of, 610 
multiplication of, 609, 610 
polar form of, 613-618 
subtraction of, 610 
Complex plane, 611 
extended, 718, 744-745 
sets in, 620 
Complex potential, 786 
electrostatic fields, 760-761 
of fluid flow, 771, 773-774 
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Complex roots: 
higher-order homogeneous linear 
ODEs: 
multiple, 115 
simple, 113-114 
second-order homogeneous linear 
ODEs, 57-59 
Complex trigonometric polynomials, 
529 
Complex variables, 620-621 
Complex vector space, 309, 310, 
349 
Components (vectors), 126, 356, 365 
Composition, of linear 
transformations, 316-317 
Computer-aided design (CAD), 820 
Condition: 
of incompressibility, 405 
spline interpolation, 823 
Conditionally convergent series, 675 
Conditional probability, 1022-1023, 
1061 
Condition number, 868-870, 899 
Confidence intervals, 1063, 
1068-1077, 1113 
interval estimates, 1065 
for mean of normal distribution: 
with known variance, 
1069-1071 
with unknown variance, 
1071-1073 
for parameters of distributions 
other than normal, 1076 
in regression analysis, 1107-1108 
for variance of a normal 
distribution, 1073-1076 
Confidence level, 1068 
Conformality, 738 
Conformal mapping, 736-757 
boundary value problems, 
763-767, A96 
defined, 738 
geometry of analytic functions, 
737-742 
linear fractional transformations, 
742-750 
extended complex plane, 
744-745 
mapping standard domains, 
747-7150 
Riemann surfaces, 754-756 
by trigonometric and hyperbolic 
analytic functions, 
750-754 
Connected graphs, 977, 981, 984 
Connected set, in complex plane, 
620 
Conservative physical systems, 422 
Conservative vector fields, 400, 408 
Consistent linear systems, 277 


Constant coefficients: 
higher-order homogeneous linear 
ODEs, 111-116 
distinct real roots, 112-113 
multiple real roots, 114-115 
simple complex roots, 113-114 
second-order homogeneous linear 
ODEs, 53-60 
complex roots, 57-59 
real double root, 55-56 
two distinct real roots, 54-55 
second-order nonhomogeneous 
linear ODEs, 81 
systems of ODEs, 140-151 
critical points, 142-146, 
148-151 
graphing solutions in phase 
plane, 141-142 
Constant of gravity, at the Earth’s 
surface, 63 
Constant of integration, 18 
Constant revenue, lines of, 954 
Constrained (linear) optimization, 
951, 954-958, 969 
normal form of problems, 955-957 
simplex method, 958-968 
degenerate feasible solution, 
962-965 
difficulties in starting, 965-968 
Constraints, 951 
Consumers, 1092 
Consumer’s risk, 1094 
Consumption matrix, 334 
Continuity equation (compressible 
fluid flow), 405 
Continuous complex functions, 621 
Continuous distributions, 1029, 
1032-1034 
marginal distribution of, 1055 
two-dimensional, 1053 
Continuous random variables, 1029, 
1032-1034, 1061 
Continuous vector functions, 378-379 
Contour integral, 653 
Contour lines, 21, 36 
Control charts, 1088 
for mean, 1088-1089 
for range, 1090-1091 
for standard deviation, 1090 
for variance, 1089-1090 
Controlled variables, in regression 
analysis, 1103 
Control limits, 1088, 1089 
Control variables, 951 
Convergence: 
absolute: 
defined, 674 
and uniform convergence, 704 
of approximate and exact 
solutions, 936 


Convergence: (Cont.) 
circle of, 682 
defined, 861 
Gauss-Seidel iteration, 861-862 
mean square (orthogonal series), 
507-508 
in the norm, 507 
power series, 680-682 
convergence tests, 674-676, 
A93-A94 
radius of convergence of, 
682-684, 706 
uniform convergence, 698-705 
radius of, 172 
defined, 172 
power series, 682-684, 706 
sequence of vectors, 378 
speed of (numeric analysis), 
804-805 
superlinear, 806 
uniform: 
and absolute convergence, 704 
power series, 698-705 
Convergence interval, 171, 683 
Convergence tests, 674-676 
power series, 674-676, A93-A94 
uniform convergence, 698-705 
Convergent iteration processes, 
800 
Convergent sequence of functions, 
507-508, 672 
Convergent series, 171, 673 
Convolution: 
defined, 232 
Fourier transforms, 527-528 
Laplace transforms, 232—237 
Convolution theorem, 232—233 
Coriolis, Gustave Gaspard, 389n.3 
Coriolis acceleration, 388-389 
Corrector (improved Euler method), 
903 
Correlation analysis, 1063, 
1108-1111, 1113 
defined, 1103 
test for correlation coefficient, 
1110-1111 
Correlation coefficient, 1108-1111, 
1113 
Cosecant, formula for, A65 
Cosine function: 
conformal mapping by, 752 
formula for, A63—A65 
Cosine integral: 
formula for, A69 
table, A98 
Cosine series, 781 
Cotangent, formula for, A65 
Coulomb, Charles Augustin de, 
19n.6, 93n.7, 401n.6 
Coulomb’s law, 19, 401 
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Covariance: 
in correlation analysis, 1109 
defined, 1058 
Cramer, Gabriel, 31n.7, 298n.2 
Cramer’s rule, 292, 298-300, 321 
for three equations, 293 
for two equations, 292 
Cramer’s Theorem, 298 
Crank, John, 938n.5 
Crank—Nicolson (CN) method, 
938-941 
Critical damping, 65, 66 
Critical points, 33, 165 
asymptotically stable, 149 
and conformal mapping, 738, 757 
constant-coefficient systems of 
ODEs, 142-146 
center, 144 
criteria for, 148-151 
degenerate node, 145-146 
improper node, 142 
proper node, 143 
saddle point, 143 
spiral point, 144-145 
stability of, 149-151 
isolated, 152 
nonlinear systems, 152 
stable, 140, 149 
stable and attractive, 140, 149 
unstable, 140, 149 
Critical region, 1079 
Cross product, 368, 410. See also 
Vector product 
Crout, Prescott Durand, 853n.2 
Crout’s method, 853, 898 
Cubic spline, 821 
Cumulative absolute frequencies (of 
values), 1012 
Cumulative distribution functions, 
1029 
Cumulative relative frequencies (of 
values), 1012 
Curl, A76 
invariance of, A85—A88 
of vector fields, 406-409, 412 
Curvature, of a curve, 389-390 
Curves: 
arc of, 383 
bell-shaped, 13, 574 
Bezier, 827 
deflection, 120 
elastic, 120 
equipotential, 36, 759, 761 
one-parameter family of, 36-37 
operating characteristic, 1081, 
1092, 1095 
oriented, 644 
orthogonal coordinate, A74 
parameter, 442 
plane, 383 


Curves: (Cont.) 
regression, 1103 
simple, 383 
simple closed, 646 
smooth, 414, 644 
solution, 4-6 
twisted, 383 
vector differential calculus, 
381-392, 411 
arc length of, 385-386 
length of, 385 
in mechanics, 386-389 
tangents to, 384-385 
and torsion, 389-390 
Curve fitting, 872-876 
method of least squares, 872-874 
by polynomials of degree m, 
874-875 
Curvilinear coordinates, 354, 412, A74 
Cut sets, 994-996, 1008 
Cycle (paths), 976, 984 
Cylindrical coordinates, 593-594, 
A74-A76 


D’ Alembert, Jean le Rond, 554n.1 
D’Alembert’s solution, 553-556 
Damped oscillations, 67 
Damping constant, 65 
Dantzig, George Bernard, 959 
Data processing: 
frequency distributions, 
1011-1012 
and randomness, 1064 
Data representation: 
frequency distributions, 
1011-1015 
Empirical Rule, 1014 
graphic, 1012 
mean, 1013-1014 
standard deviation, 1014 
variation, 1014 
and randomness, 1064 
Decisions: 
false, risks of making, 1080 
statistics for, 1077-1078 
Dedekind, Richard, A72n.3 
Defect (eigenvalue), 328 
Defectives, 1092 
Definite integrals, complex, see Line 
integrals 
Deflection curve, 120 
Deformation of path, principle of, 
656 
Degenerate feasible solution (simplex 
method), 962-965 
Degenerate node, 145-146 
Degrees of freedom (d.f.), number of, 
1071, 1074 
Degree of incidence, 971 


Degree of precision (DP), 833 
Deleted neighborhood, 720 
Demand vector, 334 
De Moivre, Abraham, 616n.3 
De Moivre—Laplace limit theorem, 
1050 
De Moivre’s formula, 616 
De Morgan’s laws, 1018 
Density, 1061 
continuous two-dimensional 
distributions, 1053 
of a distribution, 1033 
Dependent random variables, 1055, 
1056 
Dependent variables, 393, 1055, 1056 
Depth First Search (DFS) algorithms, 
977 
Derivatives: 
of analytic functions, 664-668, 
688-689, A95—A96 
of complex functions, 622, 641 
Laplace transforms of, 211-212 
of matrices or vectors, 127 
of vector functions, 379-380 
Derived series, 687 
Descartes, René, 356n.1, 391n.4 
Determinants, 293-301, 321 
Cauchy, 113 
Cramer’s rule, 298-300 
defined, A81 
general properties of, 295-298 
of a matrix, 128 
of matrix products, 307-308 
of order n, 293 
proof of, A81—A83 
second-order, 291—292 
second-order homogeneous linear 
ODEs, 76 
third-order, 292—293 
Vandermonde, 113 
Wronski: 
second-order homogeneous 
linear ODEs, 75-78 
systems of ODEs, 139 
Developed, in a power series, 683 
D.f. (degrees of freedom), number of, 
1071, 1074 
DFS (Depth First Search) algorithms, 
977 
DFTs (discrete Fourier transforms), 
528-531 
Diagonalization of matrices, 341-342 
Diagonally dominant matrices, 881 
Diagonal matrices, 268 
inverse of, 305-306 
scalar, 268 
Diameter (graphs), 991 
Difference: 
complex numbers, 610 
scalar multiplication, 260 
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Difference equations (elliptic PDEs), 
923-925 
Difference quotients, 923 
Difference table, 814 
Differentiable complex functions, 
622-623 
Differentiable vector functions, 379 
Differential (total differential), 
20, 45 
Differential equations: 
applications of, 3 
defined, 2 
Differential form, 422 
exact, 21, 470 
first fundamental form, of S, 451 
floating-point, of numbers, 
791-792 
path independence and exactness 
of, 422, 470 
Differential geometry, 381 
Differential operators: 
second-order, 60 
for second-order homogeneous 
linear ODEs, 60-62 
Differentiation: 
of Laplace transforms, 238-240 
matrices or vectors, 127 
numeric, 838-839 
of power series, 687-688, 703 
termwise, 173, 687-688, 703 
Diffusion equation, 459-460, 558. 
See also Heat equation 
Digraphs (directed graphs), 971-972, 
1007 
computer representation of, 
972-974 
defined, 972 
incidence matrix of, 975 
subgraphs, 972 
Dijkstra, Edsger Wybe, 981n.4 
Dijkstra’s algorithm, 981-983, 
1008 
DIJKSTRA, ALGORITHM, 982 
Dimension of vector spaces, 286, 
311, 359 
Diocles, 391n.4 
Dirac, Paul, 226n.2 
Dirac delta function, 226-228, 237 
Directed graphs, see Digraphs 
(directed graphs) 
Directed path, 1000 
Directional derivatives (scalar 
functions), 396-397, 411 
Direction field (slope field), 9-10, 44 
Direct methods (linear system 
solutions), 858, 898. See also 
iteration 
Dirichlet, Peter Gustav LeJeune, 
462n.8 
Dirichlet boundary condition, 564 


Dirichlet problem, 605, 923 
ADI method, 929 
heat equation, 564-566 
Laplace equation, 593-596, 
925-928, 934-935 
Poisson equation, 925-928 
two-dimensional heat equation, 
564-565 
uniqueness theorem for, 462, 784 
Dirichlet’s discontinuous factor, 514 
Discharge (flow modeling), 776 
Discrete distributions, 1029-1032 
marginal distributions of, 
1053-1054 
two-dimensional, 1052-1053 
Discrete Fourier transforms (DFTs), 
528-531 
Discrete random variables, 1029, 
1030-1032, 1061 
defined, 1030 
marginal distributions of, 1054 
Discrete spectrum, 525 
Disjoint events, 1016 
Disks: 
circular, open and closed, 619 
mapping, 748-750 
Poisson’s integral formula, 779-780 
Dissipative physical systems, 422 
Distance: 
graphs, 991 
vector norms, 866 
Distinct real roots: 
higher-order homogeneous linear 
ODEs, 112-113 
second-order homogeneous linear 
ODEs, 54-55 
Distinct roots (Frobenius method), 
182 
Distributions, 226n.2. See also 
Frequency distributions; 
Probability distributions 
Distribution-free tests, 1100 
Distribution function, 1029-1032 
cumulative, 1029 
normal distributions, 1046-1047 
of random variables, 1056, A109 
sample, 1096 
two-dimensional probability 
distributions, 1051-1052 
Distributive laws, 264 
Distributivity, 363 
Divergence, A75 
fluid flow, 775 
of vector fields, 402-406 
of vector functions, 411, 453 
Divergence theorem of Gauss, 405, 
470 
applications, 458-463 
vector integral calculus, 453-457 
Divergent sequence, 672 
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Divergent series, 171, 673 
Division, of complex numbers, 610, 
615-616 
Domain(s), 393 
bounded, 652 
doubly connected, 658, 659 
of f, 620 
holes of, 653 
mapping, 737, 747-750 
multiply connected: 
Cauchy’s integral formula, 
662-663 
Cauchy’s integral theorem, 
658-659 
p-fold connected, 652-653 
sets in complex plane, 620 
simply connected, 423, 646, 652, 
653 
triply connected, 653, 658, 659 
Dominant eigenvalue, 883 
Doolittle, Myrick H., 853n.1 
Doolittle’s method, 853-855, 898 
Dot product, 312, 410. See also Inner 
product 
Double Fourier series: 
defined, 582 
rectangular membrane, 577-585 
Double integrals (vector integral 
calculus), 426-432, 470 
applications of, 428-429 
change of variables in, 429-431 
evaluation of, by two successive 
integrations, 427-428 
Double precision, floating-point 
standard for, 792 
Double root (Frobenius method), 183 
Double subscript notation, 125 
Doubly connected domains, 658, 659 
DP (degree of precision), 833 
Driving force, see Input (driving 
force) 
Duffing equation, 160 
Duhamel, Jean-Marie Constant, 
603n.4 
Duhamel’s formula, 603 


Eccentricity, of vertices, 991 
Edges: 
backward: 
cut sets, 994 
initial flow, 998 
of a path, 992 
forward: 
cut sets, 994 
initial flow, 998 
of a path, 992 
graphs, 971, 1007 
incident, 971 
Edge chromatic number, 1006 
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Edge condition, 991 
Edge incidence list (graphs), 973 
Efficient algorithms, 979 
Eigenbases, 339-341 
Eigenfunctions, 605 
circular membrane, 588 
one-dimensional heat equation, 
560 
Sturm—Liouville Problems, 
499-500 
two-dimensional heat equation, 
565 
two-dimensional wave equation, 
578, 580 
vibrating string, 547 
Eigenfunction expansion, 504 
Eigenspaces, 326, 878 
Eigenvalues, 129-130, 166, 353, 605, 
877, 899. See also Matrix 
eigenvalue problems 
circular membrane, 588 
complex matrices, 347-351 
and critical points, 149 
defined, 324 
determining, 323-329 
dominant, 883 
finding, 324-328 
one-dimensional heat equation, 
560 
Sturm-—Liouville Problems, 
499-500, A89 
two-dimensional wave equation, 
580 
vibrating string, 547 
Eigenvalues of A, 322 
Eigenvalue problem, 140 
Eigenvectors, 129-130, 166, 353, 
877, 899 
basis of, 339-340 
convergent sequence of, 886 
defined, 324 
determining, 323-329 
finding, 324-328 
Eigenvectors of A, 322 
EISPACK, 789 
Elastic curve, 120 
Electric circuits: 
analogy of electrical and 
mechanical quantities, 
97-98 
second-order nonhomogeneous 
linear ODEs, 93-99 
Electrostatic fields (potential theory), 
759-763 
complex potential, 760-761 
superposition, 761-762 
Electrostatic potential, 759 
Electrostatics (Laplace’s equation), 
593 
Elementary matrix, 281 


Elementary row operations (linear 
systems), 277 
Ellipses, area of region bounded by, 
436 
Elliptic PDEs: 
defined, 923 
numeric analysis, 922-936 
ADI method, 928-930 
difference equations, 923-925 
Dirichlet problem, 925-928 
irregular boundary, 933-935 
mixed boundary value 
problems, 931-933 
Neumann problem, 931 
Empirical Rule, 1014 
Energies, 157 
Entire function, 630, 642, 707, 718 
Entries: 
determinants, 294 
matrix, 125, 257 
Equal complex numbers, 609 
Equality: 
of matrices, 126, 259 
of vectors, 355 
Equally likely events, 1018 
Equal spacing (interpolation): 
Newton’s backward difference 
formula, 818-819 
Newton’s forward difference 
formula, 815-818 
Equilibrium harvest, 36 
Equilibrium solutions (equilibrium 
points), 33-34 
Equipotential curves, 36, 759, 761 
Equipotential lines, 38 
electrostatic fields, 759, 761 
fluid flow, 771 
Equipotential surfaces, 759 
Equivalent vector norms, 871 
Error(s): 
in acceptance sampling, 
1093-1094 
of approximations, 495 
in numeric analysis, 842 
basic error principle, 796 
error propagation, 795 
errors of numeric results, 
794-795 
roundoff, 792 
in statistical tests, 1080-1081 
and step size control, 906-907 
trapezoidal rule, 830 
vector norms, 866 
Error bounds, 795 
Error estimate, 908 
Error function, 828, A67—A68, A98 
Essential singularity, 715-716 
Estimation of parameters, 1063 
EULER, ALGORITHM, 903 
Euler, Leonhard, 31n.7, 71n.4 


Euler—Cauchy equations, 71-74, 
104 
higher-order nonhomogeneous 
linear ODEs, 119-120 
Laplace’s equation, 595 
third-order, IVP for, 108 
Euler—Cauchy method, 901 
Euler constant, 198 
Euler formulas, 58 
complex Fourier integral, 523 
derivation of, 479-480 
exponential function, 631 
Fourier coefficients given by, 476, 
484 
generalized, 582 
Taylor series, 694 
trigonometric function, 634 
Euler graph, 980 
Euler’s method: 
defined, 10 
error of, 901-902, 906, 908 
first-order ODEs, 10-11, 901-902 
backward method, 909-910 
improved method, 902-904 
higher order ODEs, 916-917 
Euler trail, 980 
Even functions, 486-488 
Even periodic extension, 488-490 
Events (probability theory), 
1016-1017, 1060 
addition rule for, 1021—1022 
arbitrary, 1021-1022 
complements of, 1016 
defined, 1015 
disjoint, 1016 
equally likely, 1018 
independent, 1022-1023 
intersection, 1016, 1017 
mutually exclusive, 1016, 1021 
simple, 1015 
union, 1016-1017 
Exact differential equation, 21 
Exact differential form, 422, 470 
Exact ODEs, 20-27, 45 
defined, 21 
integrating factors, 23-26 
Existence, problem of, 39 
Existence theorems: 
cubic splines, 822 
first-order ODEs, 39-42 
homogeneous linear ODEs: 
higher-order, 108 
second-order, 74 
of the inverse, 301-302 
Laplace transforms, 209-210 
linear systems, 138 
power series solutions, 172 
systems of ODEs, 137 
Expectation, 1035, 1037-1038, 
1057 
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Experiments: 
defined, 1015, 1060 
in probability theory, 1015-1016 
random, 1011, 1015-1016, 1060 
Experimental error, 794 
Explicit formulas, 913 
Explicit method: 
heat equation, 937, 940-941 
wave equation, 943 
Explicit solution, 21 
Exponential decay, 5, 7 
Exponential function, 630-633, 642 
formula for, A63 
Taylor series, 694 
Exponential growth, 5 
Exponential integral, formula for, A69 
Exposed vertices, 1001, 1003 
Extended complex plane: 
conformal mapping, 744-745 
defined, 718 
Extended method (separable ODEs), 
17-18 
Extended problems, 966 
Extrapolation, 808 
Extrema (unconstrained 
optimization), 951 


Factorial function, 1027, A66, A98. 
See also Gamma functions 
Failing to reject a hypothesis, 1081 
Fair die, 1018, 1019 
False decisions, risks of making, 
1080 
False position, method of, 807-808 
Family of curves, one-parameter, 
36-37 
Family of solutions, 5 
Faraday, Michael, 93n.7 
Fast Fourier transforms (FFTs), 
531-532 
F-distribution, 1086, Al05—A108 
Feasibility region, 954 
Feasible solutions, 954-955 
basic, 957, 959 
degenerate, 962-965 
normal form of linear optimization 
problems, 957 
Fehlberg, E., 907 
Fehlberg’s fifth-order RK method, 
907-908 
Fehlberg’s fourth-order RK method, 
907-908 
FFTs (fast Fourier transforms), 
531-532 
Fibonacci (Leonardo of Pisa), 690n.2 
Fibonacci numbers, 690 
Fibonacci’s rabbit problem, 690 
Finite complex plane, 718. See also 
Complex plane 


Finite jumps, 209 
First boundary value problem, see 
Dirichlet problem 
First fundamental form, of S, 451 
First-order method, Euler method as, 
902 
First-order ODEs, 2-45, 44 
defined, 4 
direction fields, 9-10 
Euler’s method, 10-11 
exact, 20-27, 45 
defined, 21 
integrating factors, 23-26 
explicit form, 4 
geometric meanings of, 9-12 
implicit form, 4 
initial value problem, 38—43 
linear, 27-36 
Bernoulli equation, 31-33 
homogeneous, 28 
nonhomogeneous, 28-29 
population dynamics, 33-34 
modeling, 2-8 
numeric analysis, 901-915 
Adams-—Bashforth methods, 
911-914 
Adams—Moulton methods, 
913-914 
backward Euler method, 
909-910 
Euler’s method, 901—902 
improved Euler’s method, 
902-904 
multistep methods, 911-915 
Runge—Kutta—Fehlberg 
method, 906-908 
Runge-Kutta methods, 
904-906 
orthogonal trajectories, 36-38 
separable, 12-20, 44 
extended method, 17-18 
modeling, 13-17 
systems of, 165 
transformation of systems to, 
157-159 
First (first order) partial derivatives, 
A71 
First shifting theorem (s-shifting), 
208-209 
First transmission line equation, 599 
Fisher, Sir Ronald Aylmer, 1086 
Fixed points: 
defined, 799 
of a mapping, 745 
Fixed-point iteration (numeric 
analysis), 798-801, 842 
Fixed-point systems, numbers in, 791 
Floating, 793 
Floating-point form of numbers, 
791-792 


Flow augmenting paths, 992-993, 
998, 1008 
Flow problems in networks 
(combinatorial optimization), 
991-997 
cut sets, 994-996 
flow augmenting paths, 992-993 
paths, 992 
Fluid flow: 
Laplace’s equation, 593 
potential theory, 771-777 
Fluid state, 404 
Flux (motion of a fluid), 404 
Flux integral, 444, 450 
Forced motions, 68, 86 
Forced oscillations: 
Fourier analysis, 492-495 
second-order nonhomogeneous 
linear ODEs, 85-92 
damped, 89-90 
resonance, 88-91 
undamped, 87-89 
Forcing function, 86 
Ford, Lester Randolph, Jr., 998n.7 
FORD-FULKERSON, 
ALGORITHM, 998 
Ford—Fulkerson algorithm for 
maximum flow, 998-1001, 
1008 
Forest (graph), 987 
Form(s): 
canonical, 344 
complex, 351 
differential, 422 
exact, 21, 470 
path independence and 
exactness of, 422 
Hesse’s normal, 366 
Lagrange’s, 812 
normal (linear optimization 
problems), 955-957, 959, 
969 
Pfaffian, 422 
polar, of complex numbers, 
613-618, 631 
quadratic, 343-344, 346 
reduced echelon, 279 
row echelon, 279-280 
skew-Hermitian and Hermitian, 
351 
standard: 
first-order ODEs, 27 
higher-order homogeneous 
linear ODEs, 105 
higher-order linear ODEs, 123 
power series method, 172 
second-order linear ODEs, 46, 
103 
triangular (Gauss elimination), 
846 
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Forward edge: 
cut sets, 994 
initial flow, 998 
of a path, 992 
Four-color theorem, 1006 
Fourier, Jean-Baptiste Joseph, 473n.1 
Fourier analysis, 473-539 
approximation by trigonometric 
polynomials, 495-498 
forced oscillations, 492-495 
Fourier integral, 510-517 
applications, 513-515 
complex form of, 522-523 
sine and cosine, 515-516 
Fourier series, 474-483 
convergence and sum of, 
480-48 1 
derivation of Euler formulas, 
479-480 
even and odd functions, 
486-488 
half-range expansions, 488-490 
from period 277 to 2L, 
483-486 
Fourier transforms, 522-536 
complex form of Fourier 
integral, 522-523 
convolution, 527-528 
cosine, 518-522, 534 
discrete, 528-531 
fast, 531-532 
and its inverse, 523-524 
linearity, 526-527 
sine, 518-522, 535 
spectrum representation, 525 
orthogonal series (generalized 
Fourier series), 504-510 
completeness, 508-509 
mean square convergence, 
507-508 
Sturm-—Liouville Problems, 
498-504 
eigenvalues, eigenfunctions, 
499-500 
orthogonal functions, 500-503 
Fourier—Bessel series, 506—507, 589 
Fourier coefficients, 476, 484, 538, 
582-583 
Fourier constants, 504—505 
Fourier cosine integral, 515-516 
Fourier cosine series, 484, 486, 538 
Fourier cosine transforms, 518-522, 
534 
Fourier cosine transform method, 518 
Fourier integrals, 510-517, 539 
applications, 513-515 
complex form of, 522-523 
heat equation, 568-571 
residue integration, 729-730 
sine and cosine, 515-516 


Fourier—Legendre series, 505-506, 
596-598 
Fourier matrix, 530 
Fourier series, 473-483, 538 
convergence and sum or, 480-481 
derivation of Euler formulas, 
479-480 
double, 577-585 
even and odd functions, 486-488 
half-range expansions, 488-490 
heat equation, 558-563 
from period 277 to 2L, 483-486 
Fourier sine integral, 515-516 
Fourier sine series, 477, 486, 538 
one-dimensional heat equation, 
561 
vibrating string, 548 
Fourier sine transforms, 518—522, 
535 
Fourier transforms, 522-536, 539 
complex form of Fourier integral, 
522-523 
convolution, 527-528 
cosine, 518-522, 534, 539 
defined, 522, 523 
discrete, 528-531 
fast, 531-532 
heat equation, 571-574 
and its inverse, 523-524 
linearity of, 526-527 
sine, 518-522, 535, 539 
spectrum representation, 525 
Fourier transform method, 524 
Four-point formulas, 841 
Fraction defective chars, 1091-1092 
Francis, J. G. F., 892 
Fredholm, Erik Ivar, 198n.7, 263n.3 
Free condition (spline interpolation), 
823 
Free oscillations of mass—spring 
system (second-order ODEs), 
62-70 
critical damping, 65, 66 
damped system, 64—65 
overdamping, 65-66 
undamped system, 63-64 
underdamping, 65, 67 
Frenet, Jean-Frédéric, 392 
Frenet formulas, 392 
Frequency (in statistics): 
absolute, 1012, 1019 
cumulative absolute, 1012 
cumulative relative, 1012 
relative class, 1012 
Frequency (of vibrating string), 547 
Frequency distributions, mean and 
variance of: 
expectation, 1037-1038 
moments, 1038 
transformation of, 1036-1037 


Fresnel, Augustin, 697n.4, A68n.1 
Fresnel integrals, 697, A68 
Frobenius, Georg, 180n.4 
Frobenius method, 167, 180-187, 
201 
indicial equation, 181—183 
proof of, A77—A81 
typical applications, 183-185 
Frobenius norm, 861 
Fulkerson, Delbert Ray, 998n.7 
Function, of complex variable, 
620-621 
Function spaces, 313 
Fundamental matrix, 139 
Fundamental period, 475 
Fundamental region (exponential 
function), 632 
Fundamental system, 50, 104. See 
also Basis, of solutions 
Fundamental Theorem: 
higher-order homogeneous linear 
ODEs, 106 
for linear systems, 288 
PDEs, 541-542 
second-order homogeneous linear 
ODEs, 48 


Galilei, Galileo, 16n.4 
Gamma functions, 190-191, 208 
formula for, A66—A67 
incomplete, A67 
table, A98 
GAMS (Guide to Available 
Mathematical Software), 789 
GAUSS, ALGORITHM, 849 
Gauss, Carl Friedrich, 186n.5, 
608n.1, 1103 
Gauss distribution, 1045. See also 
Normal distributions 
Gauss “Double Ring,” 451 
Gauss elimination, 320, 849 
linear systems, 274-280, 
844-852, 898 
back substitution, 274-276, 
846 
elementary row operations, 
277 
if infinitely many solutions 
exist, 278 
if no solution exists, 278-279 
operation count, 850-851 
row echelon form, 279-280 
operation count, 850-851 
Gauss integration formulas, 807, 
836-838, 843 
Gauss—Jordan elimination, 302—304, 
856-857 
GAUSS-SEIDEL, ALGORITHM, 
860 
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Gauss-Seidel iteration, 858-863, 
898 
Gauss’s hypergeometric ODE, 186, 
202 
Geiger, H., 1044, 1100 
Generalized Euler formula, 582 
Generalized Fourier series, see 
Orthogonal series 
Generalized solution (vibrating 
string), 550 
Generalized triangle inequality, 615 
General powers, 639-640, 642 
General solution: 
Bessel’s equation, 194-200 
first-order ODEs, 6, 44 
higher-order linear ODEs, 106, 
110-111, 123 
nonhomogeneous linear systems, 
160 
second-order linear ODEs: 
homogeneous, 49-51, 77-78, 
104 
nonhomogeneous, 80-81 
systems of ODEs, 131-132, 139 
Generating functions, 179, 241 
Geometric interpretation: 
partial derivatives, A70 
scalar triple product, 373, 374 
Geometric multiplicity, 326, 878 
Geometric series, 168, 675 
Taylor series, 694 
uniformly convergent, 698 
Gerschgorin, Semyon Aranovich, 
879n.6 
Gerschgorin’s theorem, 879-881, 899 
Gibbs phenomenon, 515 
Global error, 902 
Golden Rule, 15, 24 
Gompertz model, 19 
Goodness of fit, 1096-1100 
Gosset, William Sealy, 1086n.4 
Goursat, Edouard, 654n.1 
Goursat’s proof, 654 
Gradient, A75 
fluid flow, 771 
of a scalar field, 395-402 
directional derivatives, 
396-397 
maximum increase, 398 
as surface normal vector, 
398-399 
vector fields that are, 400-401 
of a scalar function, 396, 411 
unconstrained optimization, 952 
Gradient method, 952. See also 
Method of steepest descent 
Graphs, 970-971, 1007 
bipartite, 1001-1006, 1008 
center of, 991 
complete, 974 


Graphs (Cont. ) 
complete bipartite, 1005 
computer representation of, 
972-974 
connected, 977, 981, 984 
diameter of, 991 
digraphs (directed graphs), 
971-974, 1007 
computer representation of, 
972-974 
defined, 972 
incidence matrix of, 975 
subgraphs, 972 
Euler, 980 
forest, 987 
incidence matrix of, 975 
planar, 1005 
radius of, 991 
sparse, 974 
subgraphs, 972 
trees, 984 
vertices, 971, 977, 1007 
adjacent, 971, 977 
central, 991 
coloring, 1005-1006 
double labeling of, 986 
eccentricity of, 991 
exposed, 1001, 1003 
four-color theorem, 1006 
scanning, 998 
weighted, 976 
Graphic data representation, 1012 
Gravitation (Laplace’s equation), 
593 
Gravity, acceleration of, 8 
Gravity constant, at the Earth’s 
surface, 63 
Greedy algorithm, 984-988 
Green, George, 433n.4 
Green’s first formula, 461, 470 
Green’s second formula, 461, 470 
Green’s theorem: 
first and second forms of, 
461 
in the plane, 433-438, 470 
Gregory, James, 816n.2 
Gregory—Newton’s (Newton’s) 
backward difference 
interpolation formula, 
818-819 
Gregory—Newton’s (Newton’s) 
forward difference 
interpolation formula, 
815-818 
Growth restriction, 209 
Guidepoints, 827 
Guide to Available Mathematical 
Software (GAMS), 789 
Guldin, Habakuk, 452n.7 
Guldin’s theorem, 452n.7 
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Hadamard, Jacques, 683n.1 
Half-planes: 
complex analysis, 619-620 
mapping, 747-749 
Half-range expansions (Fourier 
series), 488-490, 538 
Hamilton, William Rowan, 976n.1 
Hamiltonian cycle, 976 
Hankel, Hermann, 200n.8 
Hankel functions, 200 
Harmonic conjugate function 
(Laplace’s equation), 629 
Harmonic functions, 460, 462, 758 
complex analysis, 628-629 
under conformal mapping, 763 
defined, 758 
Laplace’s equation, 593, 628-629 
maximum modulus theorem, 
783-784 
potential theory, 781-784, 786 
Harmonic oscillation, 63-64 
Heat equation, 459-460, 557-558 
Dirichlet problem, 564—566 
Laplace’s equation, 564 
numeric analysis, 936-941, 948 
Crank—Nicolson method, 
938-941 
explicit method, 937, 940-941 
one-dimensional, 559 
solution: 
by Fourier integrals, 568-571 
by Fourier series, 558-563 
by Fourier transforms, 
571-574 
steady two-dimensional heat 
problems, 546-566 
two-dimensional, 564—566 
unifying power of methods, 566 
Heat flow: 
Laplace’s equation, 593 
potential theory, 767-770 
Heat flow lines, 767 
Heaviside, Oliver, 204n.1 
Heaviside calculus, 204n.1 
Heaviside expansions, 228 
Heaviside function, 217-219 
Helix, 386 
Henry, Joseph, 93n.7 
Hermite, Charles, 510n.8 
Hermite interpolation, 826 
Hermitian form, 351 
Hermitian matrices, 347, 348, 350, 353 
Hertz, Heinrich, 63n.3 
Hesse, Ludwig Otto, 366n.2 
Hesse’s normal form, 366 
Heun, Karl, 905n.1 
Heun’s method, 903. See also 
Improved Euler’s method 
Higher functions, 167. See also 
Special functions 
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Higher-order linear ODEs, 105-123 
homogeneous, 105-116, 123 
nonhomogeneous, 116-123 
systems of, see Systems 

of ODEs 
Higher order ODEs (numeric 
analysis), 915-922 
Euler method, 916-917 
Runge-Kutta methods, 917-919 
Runge—Kutta—Nystr6m methods, 
919-921 

Higher transcendental functions, 920 

High-frequency line equations, 600 

Hilbert, David, 198n.7, 312n.4 

Hilbert spaces, 363 

Histograms, 1012 

Holes, of domains, 653 

Homogeneous first-order linear 

ODEs, 28 
Homogeneous higher-order linear 
ODEs, 105-111 
Homogeneous linear systems, 138, 
165, 272, 290-291, 845 
constant-coefficient systems, 
140-151 
matrices and vectors, 124-130, 321 
trivial solution, 290 
Homogeneous PDEs, 541 
Homogeneous second-order linear 
ODEs, 46-48 
basis, 50-52 
with constant coefficients, 53-60 
complex roots, 57-59 
real double root, 55-56 
two distinct real-roots, 54-55 
differential operators, 60-62 
Euler—Cauchy equations, 71-74 
existence and uniqueness of 
solutions, 74-79 
general solution, 49-51, 77-78 
initial value problem, 49-50 
modeling free oscillations of 
mass-spring system, 62—70 
particular solution, 49-51 
reduction of order, 51-52 
Wronskian, 75—78 

Hooke, Robert, 62 

Hooke’s law, 62 

Householder, Alston Scott, 888n.11 

Householder’s tridiagonalization 

method, 888-892 
Hyperbolic analytic functions 
(conformal mapping), 750-754 

Hyperbolic cosine, 635, 752 

Hyperbolic functions, 635, 642 
formula for, A65—A66 
inverse, 640 
Taylor series, 695 

Hyperbolic PDEs: 
defined, 923 
numeric analysis, 942—945 


Hyperbolic sine, 635, 752 
Hypergeometric distributions, 
1042-1044, 1061 
Hypergeometric equations, 167, 
185-187 
Hypergeometric functions, 167, 186 
Hypergeometric series, 186 
Hypothesis, 1077 
Hypothesis testing (in statistics), 
1063, 1077-1087 
comparison of means, 1084-1085 
comparison of variances, 1086 
errors in tests, 1080-1081 
for mean of normal distribution 
with known variance, 
1081-1083 
for mean of normal distribution 
with unknown variance, 
1083-1084 
one- and two-sided alternatives, 
1079-1080 


Idempotent matrices, 270 
Identity mapping, 745 
Identity matrices, 268 
Identity operator (second-order 
homogeneous linear ODEs), 60 
Ill-conditioned equations, 805 
Ill-conditioned problems, 864 
Ill-conditioned systems, 864, 865, 
899 
Ill-conditioning (linear systems), 
864-872 
condition number of a matrix, 
868-870 
matrix norms, 866-868 
vector norms, 866 
Image: 
conformal mapping, 737 
linear transformations, 313 
Imaginary axis (complex plane), 611 
Imaginary part (complex numbers), 
609 
Imaginary unit, 609 
Impedance (RLC circuits), 95 
Implicit formulas, 913 
Implicit method: 
backward Euler scheme as, 909 
for hyperbolic PDEs, 943 
Implicit solution, 21 
Improper integrals: 
defined, 205 
residue integration, 726-732 
Improper node, 142 
Improved Euler’s method: 
error of, 904, 906, 908 
first-order ODEs, 902-904 
Impulse, of a force, 225 
short impulses, 225-226 
unit impulse function, 226 


Incidence matrices (graphs and 
digraphs), 975 
Incident edges, 971 
Inclusion theorems: 
defined, 882 
matrix eigenvalue problems, 
879-884 
Incomplete gamma functions, 
formula for, A67 
Inconsistent linear systems, 277 
Indefinite (quadratic form), 346 
Indefinite integrals: 
defined, 643 
existence of, 656-658 
Indefinite integration (complex line 
integral), 646-647 
Independence: 
of path, 669 
of path in domain (integrals), 470, 
655 
of random variables, 1055-1056 
Independent events, 1022-1023, 
1061 
Independent sample values, 1064 
Independent variables: 
in calculus, 393 
in regression analysis, 1103 
Indicial equation, 181-183, 188, 202 
Indirect methods (solving linear 
systems), 858, 898 
Inference, statistical, 1059, 1063 
Infinite dimensional vector space, 
311 
Infinite populations, 1044 
Infinite sequences: 
bounded, A93—A95 
monotone real, A72—A73 
power series, 671-673 
Infinite series, 673-674 
Infinity: 
analytic of singular at, 718-719 
point at, 718 
Initial conditions: 
first-order ODEs, 6, 7, 44 
heat equation, 559, 568, 569 
higher-order linear ODEs: 
homogeneous, 107 
nonhomogeneous, 117 
one-dimensional heat equation, 
559 
PDEs, 541, 605 
second-order homogeneous linear 
ODEs, 49-50, 104 
systems of ODEs, 137 
two-dimensional wave equation, 
577 
vibrating string, 545 
Initial point (vectors), 355 
Initial value problem (IVP): 
defined, 6 
first-order ODEs, 6, 39, 44, 901 
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Initial value problem (IVP): (Cont.) 
bell-shaped curve, 13 
existence and uniqueness of 

solutions for, 38-43 
higher-order linear ODEs, 123 
homogeneous, 107-108 
nonhomogeneous, 117 
Laplace transforms, 213-216 
for RLC circuit, 99 
second-order homogeneous linear 
ODEs, 49, 74-75, 104 
systems of ODEs, 137 
Injective mapping, 737n.1 
Inner product (dot product), 312 
for complex vectors, 349 
invariance of, 336 
vector differential calculus, 
361-367, 410 
applications, 364-366 
orthogonality, 361-363 

Inner product spaces, 311-313 

Input (driving force), 27, 86, 214 

Instability, numeric vs. mathematical, 
796 

Integrals, see Line integrals 

Integral equations: 

defined, 236 
Laplace transforms, 236-237 

Integral of a function, Laplace 
transforms of, 212-213 

Integral transforms, 205, 518 

Integrand, 414, 644 

Integrating factors, 23-26, 45 

defined, 24 
finding, 24-26 
Integration. See also Complex 
integration 
constant of, 18 
of Laplace transforms, 238-240 
numeric, 827-838 
adaptive, 835-836 
Gauss integration formulas, 
836-838 
rectangular rule, 828 
Simpson’s rule, 831-835 
trapezoidal rule, 828-831 
termwise, of power series, 687, 
688 

Intermediate value theorem, 807-808 

Intermediate variables, 393 

Intermittent harvesting, 36 

INTERPOL, ALGORITHM, 814 

Interpolation, 529 

defined, 808 
numeric analysis, 808-820, 842 
equal spacing, 815-819 
Lagrange, 809-812 
Newton’s backward difference 
formula, 818-819 
Newton’s divided difference, 
812-815 


Interpolation (Cont.) 
Newton’s forward difference 
formula, 815-818 
spline, 820-827 
Interpolation polynomial, 808, 842 
Interquartile range, 1013 
Intersection, of events, 1016, 1017 
Intervals. See also Confidence 
intervals 
class, 1012 
closed, A72n.3 
convergence, 171, 683 
open, 4, A72n.3 
Interval estimates, 1065 
Invariance, of curl, A85—A88 
Invariant rank, 283 
Invariant subspace, 878 
Inverse cosine, 640 
Inverse cotangent, 640 
Inverse Fourier cosine transform, 518 
Inverse Fourier sine transform, 519 
Inverse Fourier sine transform 
method, 519 
Inverse Fourier transform, 524 
Inverse hyperbolic function, 640 
Inverse hyperbolic sine, 640 
Inverse mapping, 741, 745 
Inverse of a matrix, 128, 301-309, 
321 
cancellation laws, 306-307 
determinants of matrix 
products, 307-308 
formulas for, 304—306 
Gauss—Jordan method, 
302-304, 856-857 
Inverse sine, 640 
Inverse tangent, 640 
Inverse transform, 205, 253 
Inverse transformation, 315 
Inverse trigonometric function, 640 
Irreducible, 883 
Irregular boundary (elliptic PDEs), 
933-935 
Trrotational flow, 774 
Isocline, 10 
Isolated critical point, 152 
Isolated essential singularity, 715 
Isolated singularity, 715 
Isotherms, 36, 38, 402, 767 
Iteration (iterative) methods: 
numeric analysis, 798-808 
fixed-point iteration, 798-801 
Newton’s (Newton—Raphson) 
method, 801-805 
secant method, 805-806 
speed of convergence, 804-805 
numeric linear algebra, 858-864, 
898 
Gauss-Seidel iteration, 858-862 
Jacobi iteration, 862-863 
IVP, see Initial value problem 
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Jacobi, Carl Gustav Jacob, 430n.3 
Jacobians, 430, 741 

Jacobi iteration, 862-863 

Jordan, Wilhelm, 302n.3 
Joukowski airfoil, 739-740 


Kantorovich, Leonid Vitaliyevich, 
959n.1 

KCL (Kirchhoff’s Current Law), 
93n.7, 274 

Kernel, 205 

Kinetic friction, coefficient of, 19 

Kirchhoff, Gustav Robert, 93n.7 

Kirchhoff’s Current Law (KCL), 
93n.7, 274 

Kirchhoff’s law, 991 

Kirchhoff’s Voltage Law (KVL), 29, 
93, 274 

Koopmans, Tjalling Charles, 959n.1 

Kreyszig, Erwin, 855n.3 

Kronecker, Leopold, 500n.5 

Kronecker delta, A85 

Kronecker symbol, 500 

Kruskal, Joseph Bernard, 985n.5 

KRUSKAL, ALGORITHM, 985 

Kruskal’s Greedy algorithm, 
984-988, 1008 

kth backward difference, 818 

kth central moment, 1038 

kth divided difference, 813 

kth forward difference, 815-816 

kth moment, 1038, 1065 

Kublanovskaya, V. N., 892 

Kutta, Wilhelm, 905n.1 

Kutta’s third-order method, 911 

KVL, see Kirchhoff’s Voltage Law 


Lagrange, Joseph Louis, 51n.1 
Lagrange interpolation, 809-812 
Lagrange’s form, 812, 842 
Laguerre, Edmond, 504n.7 
Laguerre polynomials, 241, 504 
Laguerre’s equation, 240-241 
LAPACK, 789 
Laplace, Pierre Simon Marquis de, 
204n.1 
Laplace equation, 400, 564, 593-600, 
642, 923 
boundary value problem in 
spherical coordinates, 
594-596 
complex analysis, 628-629 
in cylindrical coordinates, 
593-594 
Fourier—Legendre series, 596-598 
heat equation, 564 
numeric analysis, 922-936, 948 
ADI method, 928-930 
difference equations, 923-925 
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Laplace equation (Cont.) 
Dirichlet problem, 925-928, 
934-935 
Liebmann’s method, 926-928 
in spherical coordinates, 594 
theory of solutions of, 460, 786. 
See also Potential theory 
two-dimensional heat equation, 
564 
two-dimensional problems, 759 
uniqueness theorem for, 462 
Laplace integrals, 516 
Laplace operator, 401. See also 
Laplacian 
Laplace transforms, 203-253 
convolution, 232—237 
defined, 204, 205 
of derivatives, 211—212 
differentiation of, 238-240 
Dirac delta function, 226-228 
existence, 209-210 
first shifting theorem (s-shifting), 
208-209 
general formulas, 248 
initial value problems, 213-216 
integral equations, 236-237 
of integral of a function, 212-213 
integration of, 238-240 
linearity of, 206-208 
notation, 205 
ODEs with variable coefficients, 
240-241 
partial differential equations, 
600-603 
partial fractions, 228-230 
second shifting theorem 
(t-shifting), 219-223 
short impulses, 225-226 
systems of ODEs, 242-247 
table of, 249-251 
uniqueness, 210 
unit step function (Heaviside 
function), 217-219 
Laplacian, 400, 463, 605, A76 
in cylindrical coordinates, 
593-594 
heat equation, 557 
Laplace’s equation, 593 
in polar coordinates, 585-592 
in spherical coordinates, 594 
of u in polar coordinates, 586 
Lattice points, 925-926 
Laurent, Pierre Alphonse, 708n.1 
Laurent series, 708-719, 734 
analytic or singular at infinity, 
718-719 
point at infinity, 718 
Riemann sphere, 718 
singularities, 715-717 
zeros of analytic functions, 717 


Laurent’s theorem, 709 
LCL (lower control limit), 1088 
Least squares approximation, of a 
function, 875-876 
Least squares method, 872-876, 899 
Least squares principle, 1103 
Lebesgue, Henri, 876n.5 
Left-handed Cartesian coordinate 
system, 369, 370, A84 
Left-hand limit (Fourier series), 480 
Left-sided tests, 1079, 1082 
Legendre, Adrien-Marie, 175n.1, 
1103 
Legendre function, 175 
Legendre polynomials, 167, 177-179, 
202 
Legendre’s equation, 167, 175— 177, 
201, 202 
Laplace’s equation, 595-596 
special, 169-170 
Leibniz, Gottfried Wilhelm, 15n.3 
Leibniz test for real series, A73—A74 
Length: 
curves, 385 
vectors, 355, 356, 410 
Leonardo of Pisa, 690n.2 
Leontief, Wassily, 334n.1 
Leontief input-output model, 334 
Leslie model, 331 
Level surfaces, 380, 398 
LFTs, see Linear fractional 
transformations 
Libby, Willard Frank, 13n.2 
Liebmann’s method, 926-928 
Likelihood function, 1066 
Limit (sequences), 672 
Limit cycle, 158-159, 621 
Limit /, 378 
Limit point, A93 
Limit vector, 378 
Linear algebra, 255. See also 
Numeric linear algebra 
determinants, 293-301 
Cramer’s rule, 298-300 
general properties of, 295-298 
of matrix products, 307-308 
second-order, 291-292 
third-order, 292—293 
inverse of a matrix, 301-309 
cancellation laws, 306-307 
determinants of matrix 
products, 307-308 
formulas for, 304-306 
Gauss—Jordan method, 
302-304 
linear systems, 272-274 
back substitution, 274-276 
elementary row operations, 277 
Gauss elimination, 274-280 
homogeneous, 290-291 


Linear algebra (Cont.) 
nonhomogeneous, 291 
solutions of, 288-291 

matrices and vectors, 257—262 
addition and scalar 
multiplication of, 
259-261 
diagonal matrices, 268 
linear independence and 
dependence of vectors, 
282-283 
matrix multiplication, 
263-266, 269-279 
notation, 258 
rank of, 283-285 
symmetric and skew-symmetric 
matrices, 267-268 
transposition of, 266-267 
triangular matrices, 268 
matrix eigenvalue problems, 
322-353 
applications, 329-334 
complex matrices and forms, 
346-352 
determining eigenvalues and 
eigenvectors, 323-329 
diagonalization of matrices, 
341-342 
eigenbases, 339-341 
orthogonal matrices, 337-338 
orthogonal transformations, 336 
quadratic forms, 343-344 
symmetric and skew- 
symmetric matrices, 
334-336 
transformation to principal 
axes, 344 
vector spaces: 
inner product spaces, 311-313 
linear transformations, 
313-317 
real, 309-311 
special, 285-287 
Linear combination: 
homogeneous linear ODEs: 
higher-order, 107 
second-order, 48 
of matrices, 129, 271 
of vectors, 129, 282 
of vectors in vector space, 311 

Linear dependence, of vectors, 
282-283 

Linear element, 386 

Linear equations, systems of, see 
Linear systems 

Linear fractional transformations 
(LFTs), 742-750, 757 

extended complex plane, 744-745 
mapping standard domains, 
747-7150 
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Linear independence: 
scalar triple product, 373 
of vectors, 282-283 
Linear inequalities, 954 
Linear interpolation, 809-810 
Linearity: 
Fourier transforms, 526-527 
Laplace transforms, 206-208 
line integrals, 645 
Linearity principle, see Superposition 
principle 
Linearization, 152-155 
Linearized system, 153 
Linearly dependent functions: 
higher-order homogeneous linear 
ODEs, 106, 109 
second-order homogeneous linear 
ODEs, 50, 75 
Linearly dependent sets, 129, 311 
Linearly dependent vectors, 282-283, 
285 
Linearly independent functions: 
higher-order homogeneous linear 
ODEs, 106, 109, 113 
second-order homogeneous linear 
ODEs, 50, 75 
Linearly independent sets, 128-129, 
311 
Linearly independent vectors, 282-283 
Linearly related variables, 1109 
Linear mapping, 314. See also Linear 
transformations 
Linear ODEs, 45, 46 
first order, 27-36 
Bernoulli equation, 31-33 
homogeneous, 28 
nonhomogeneous, 28-29 
population dynamics, 33-34 
higher-order, 105-123 
homogeneous, 105-116 
nonhomogeneous, 116-122 
higher-order homogeneous, 105 
second-order, 46—104 
homogeneous, 46-78, 103 
nonhomogeneous, 79-102, 103 
Linear operations: 
Fourier cosine and sine 
transforms as, 520 
integration as, 645 
Linear operators (second-order 
homogeneous linear ODEs), 61 
Linear optimization, see Constrained 
(linear) optimization 
Linear PDEs, 541 
Linear programming problems, 954-958 
normal form of problems, 955-957 
simplex method, 958-968 
degenerate feasible solution, 
962-965 
difficulties in starting, 965-968 


Linear systems, 138-139, 165, 
272-274, 320, 845 
back substitution, 274-276 
defined, 267, 845 
elementary row operations, 277 
Gauss elimination, 274—280, 
844-852 
applications, 277-180 
back substitution, 274-276 
elementary row operations, 277 
operation count, 850-851 
row echelon form, 279-280 
Gauss—Jordan elimination, 


856-857 
homogeneous, 138, 165, 272, 
290-291 
constant-coefficient systems, 
140-151 


matrices and vectors, 124-130 
ill-conditioning, 864-872 
condition number of a matrix, 
868-870 
matrix norms, 866-868 
vector norms, 866 
iterative methods, 858-864 
Gauss-Seidel iteration, 
858-882 
Jacobi iteration, 862-863 
LU-factorization, 852-855 
Cholesky’s method, 855-856 
of m equations in n unknowns, 272 
nonhomogeneous, 138, 160-163, 
272, 290, 291 
solutions of, 288-291, 898 
Linear transformations, 320 
motivation of multiplication by, 
265-266 
vector spaces, 313-317 
Line integrals, 643-652, 669 
basic properties of, 645 
bounds for, 650-651 
definition of, 414, 643-645 
existence of, 646 
indefinite integration and 
substitution of limits, 
646-647 
path dependence of, and 
integration around closed 
curves, 421-425 
representation of a path, 647-650 
vector integral calculus, 413-419 
definition and evaluation of, 
414-416 
path dependence of, 418-426 
work done by a force, 416-417 
Lines of constant revenue, 954 
Lines of force, 760-762 
LINPACK, 789 
Liouville, Joseph, 499n.4 
Liouville’s theorem, 666-667 
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Lipschitz, Rudolf, 42n.9 

Lipschitz condition, 42 

Ljapunoy, Alexander Michailovich, 
149n.2 

Local error, 830 

Local maximum (unconstrained 
optimization), 952 

Local minimum (unconstrained 
optimization), 951 

Local truncation error, 902 

Logarithm, 636-639 

natural, 636-638, 642, A63 
Taylor series, 695 

Logarithmic decrement, 70 

Logarithmic integral, formula for, A69 

Logarithm of base ten, formula for, 
A63 

Logistic equation, 32-33 

Longest path, 976 

Loss of significant digits (numeric 
analysis), 793-794 

Lotka, Alfred J., 155n.3 

Lotka—Volterra population model, 
155-156 

Lot tolerance percent defective 
(LTPD), 1094 

Lower confidence limits, 1068 

Lower control limit (LCL), 1088 

Lower triangular matrices, 268 

LTPD (lot tolerance percent 
defective), 1094 

LU-factorization (linear systems), 
852-855 


Machine numbers, 792 
Maclaurin, Colin, 690n.2, 712 
Maclaurin series, 690, 694-696 
Main diagonal: 
determinants, 294 
matrix, 125, 258 
Malthus, Thomas Robert, 5n.1 
Malthus’ law, 5, 33 
Maple, 789 
Maple Computer Guide, 789 
Mapping, 313, 736, 737, 757 
bijective, 737n.1 
conformal, 736-757 
boundary value problems, 
763-767, A96 
defined, 738 
geometry of analytic functions, 
737-7142 
linear fractional 
transformations, 
742-750 
Riemann surfaces, 754-756 
by trigonometric and 
hyperbolic analytic 
functions, 750-754 
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Mapping (Cont.) 
of disks, 748-750 
fixed points of, 745 
of half-planes onto half-planes, 748 
identity, 745 
injective, 737n.1 
inverse, 741, 745 
linear, 314. See also Linear 
transformations 
one-to-one, 737n.1 
spectral mapping theorem, 878 
surjective, 737n.1 
Marconi, Guglielmo, 63n.3 
Marginal distributions, 1053-1055, 
1062 
of continuous distributions, 1055 
of discrete distributions, 
1053-1054 
Mariotte, Edme, 19n.5 
Markov, Andrei Andrejevitch, 270n.1 
Markov process, 270, 331 
MATCHING, ALGORITHM, 1003 
Matching, 1008 
assignment problems, 1001 
complete, 1002 
maximum cardinality, 1001, 1008 
Mathcad, 789 
Mathematica, 789 
Mathematica Computer Guide, 789 
Mathematical models, see Models 
Mathematical modeling, see 
Modeling 
Mathematical statistics, 1009, 
1063-1113 
acceptance sampling, 1092-1096 
errors in, 1093-1094 
rectification, 1094-1095 
confidence intervals, 1068-1077 
for mean of normal distribution 
with known variance, 
1069-1071 
for mean of normal distribution 
with unknown variance, 
1071-1073 
for parameters of distributions 
other than normal, 1076 
for variance of a normal 
distribution, 1073-1076 
correlation analysis, 1108-1111 
defined, 1103 
test for correlation coefficient, 
1110-1111 
defined, 1063 
goodness of fit, 1096-1100 
hypothesis testing, 1077-1087 
comparison of means, 
1084-1085 
comparison of variances, 1086 
errors in tests, 1080-1081 


for mean of normal distribution 


with known variance, 
1081-1083 


for mean of normal distribution 


with unknown variance, 
1083-1084 
one- and two-sided 
alternatives, 1079-1080 
main purpose of, 1015 
nonparametric tests, 1100-1102 
point estimation of parameters, 
1065-1068 
quality control, 1087-1092 
for mean, 1088-1089 
for range, 1090-1091 
for standard deviation, 1090 
for variance, 1089-1090 
random sampling, 1063-1065 
regression analysis, 1103-1108 
confidence intervals in, 
1107-1108 
defined, 1103 
Matlab, 789 
Matrices, 124-130, 256-262, 320 
addition and scalar multiplication 
of, 259-261 
calculations with, 126-127 
condition number of, 868-870 
definitions and terms, 125-126, 
128, 257 
diagonal, 268 
diagonalization of, 341-342 
eigenvalues, 129-130 
equality of, 126, 259 
fundamental, 139 
inverse of, 128, 301-309, 321 
cancellation laws, 306-307 
determinants of matrix 
products, 307-308 
formulas for, 304-306 
Gauss—Jordan method, 
302-304, 856-857 
matrix multiplication, 127, 
263-266, 269-279 
applications of, 269-279 
cancellation laws, 306-307 
determinants of matrix 
products, 307-308 
scalar, 259-261 
normal, 352, 882 
notation, 258 
orthogonal, 337-338 
rank of, 283-285 
square, 126 
symmetric and skew-symmetric, 
267-268 
transposition of, 266-267 
triangular, 268 
unitary, 347-350, 353 


Matrix eigenvalue problems, 
322-353, 876-896 
applications, 329-334 
choice of numeric method for, 
879 
complex matrices and forms, 
346-352 
determining eigenvalues and 
eigenvectors, 323-329 
diagonalization of matrices, 
341-342 
eigenbases, 339-341 
inclusion theorems, 879-884 
orthogonal matrices, 337-338 
orthogonal transformations, 336 
power method, 885-888 
QR-factorization, 892-896 
quadratic forms, 343-344 
symmetric and skew-symmetric 
matrices, 334-336 
transformation to principal axes, 
344 
tridiagonalization, 888-892 
Matrix multiplication, 127, 263-266, 
269-279 
applications of, 269-279 
cancellation laws, 306-307 
determinants of matrix products, 
307-308 
scalar, 259-261 
Matrix norms, 861, 866-868 
Maximum cardinality matching, 
1001, 1003-1004, 1008 
Maximum flow: 
Ford—Fulkerson algorithm, 
998-1000 
and minimum cut set, 996 
Maximum increase: 
gradient of a scalar field, 398 
unconstrained optimization, 951 
Maximum likelihood estimates 
(MLEs), 1066-1067 
Maximum likelihood method, 
1066-1067, 1113 
Maximum modulus theorem, 782-784 
Maximum principle, 783 
Mean(s), 1013-1014, 1061 
comparison of, 1084-1085 
control chart for, 1088-1089 
of normal distributions: 
confidence intervals for, 
1069-1073 
hypothesis testing for, 
1081-1084 
probability distributions, 
1035-1039 
addition of, 1057-1058 
transformation of, 1036-1037 
sample, 1064 
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Mean square convergence (orthogonal 
series), 507-508 
Mean value (fluid flow), 774n.1 
Mean value property: 
analytic functions, 781-782 
harmonic functions, 782 
Mean value theorem, 395 
for double integrals, 427 
for surface integrals, 448 
for triple integrals, 456-457 
Median, 1013, 1100-1101 
Mendel, Gregor, 1100 
Meromorphic function, 719 
Mesh incidence matrix, 262 
Mesh points (lattice points, nodes), 
925-926 
Mesh size, 924 
Method of characteristics (PDEs), 555 
Method of least squares, 872-876, 
899 
Method of moments, 1065 
Method of separating variables, 
12-13 
circular membrane, 587 
partial differential equations, 
545-553, 605 
Fourier series, 548-551 
satisfying boundary conditions, 
546-548 
two ODEs from wave 
equation, 545-546 
vibrating string, 545-546 
Method of steepest descent, 952-954 
Method of undetermined coefficients: 
higher-order homogeneous linear 
ODEs, 115, 123 
nonhomogeneous linear systems 
of ODEs, 161 
second-order nonhomogeneous 
linear ODEs, 81-85, 104 
Method of variation of parameters: 
higher-order nonhomogeneous 
linear ODEs, 118-120, 123 
nonhomogeneous linear systems 
of ODEs, 162-163 
second-order nonhomogeneous 
linear ODEs, 99-102, 104 
Minimization (normal form of linear 
optimization problems), 957 
Minimum (unconstrained 
optimization), 951 
Minimum cut set, 996 
Minors, of determinants, 294 
Mixed boundary condition (two- 
dimensional heat equation), 
564 
Mixed boundary value problem, 605, 
923. See also Robin problem 
elliptic PDEs, 931-933 
heat conduction, 768-769 


Mixed type PDEs, 555 
Mixing problems, 14 
MLEs (maximum likelihood 
estimates), 1066-1067 
ML-inequality, 650-651 
Mobius, August Ferdinand, 447n.5 
Mobius strip, 447 
Mobius transformations, 743. See 
also Linear fractional 
transformations (LFTs) 
Models, 2 
Modeling, 1, 2-8, 44 
and concept of solution, 4-6 
defined, 2 
first-order ODEs, 2-8 
initial value problem, 6 
separable ODEs, 13-17 
typical steps of, 6-7 
and unifying power of 
mathematics, 766 
Modification Rule (method of 
undetermined coefficients): 
higher-order homogeneous linear 
ODEs, 115-116 
second-order nonhomogeneous 
linear ODEs, 81, 83 
Modulus (complex numbers), 613 
Moments, method of, 1065 
Moments of inertia, of a region, 429 
Moment vector (vector moment), 
371 
Monotone real sequences, 
A72-A73 
Moore, Edward Forrest, 977n.2 
MOORE, ALGORITHM, 977 
Moore’s BFS algorithm, 977-980, 
1008 
Morera’s theorem, 667 
Moulton, Forest Ray, 913n.3 
Multinomial distribution, 1045 
Multiple complex roots, 115 
Multiple points, curves with, 383 
Multiplication: 
of complex numbers, 609, 610, 
615 
in conditional probability, 
1022-1023 
matrix, 127, 263-266 
applications of, 269-279 
cancellation laws, 306-307 
determinants of matrix 
products, 307-308 
scalar, 259-261 
of means, 1057-1058 
of power series, 687 
scalar, 126-127, 259-261, 310 
termwise, 173, 687 
of transforms, 232. See also 
Convolution 
Multiplicity, algebraic, 326, 878 
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Multiply connected domains, 652, 
653 
Cauchy’s integral formula, 
662-663 
Cauchy’s integral theorem, 
658-659 
Multistep methods, 911-915, 947 
Adams—Bashforth methods, 
911-914 
Adams—Moulton methods, 
913-914 
defined, 908 
first-order ODEs, 911 
Mutually exclusive events, 1016, 
1021 
m X n matrix, 258 
Nabla, 396 
NAG (Numerical Algorithms Group, 
Inc.), 789 


National Institute of Standards and 
Technology (NIST), 789 
Natural condition (spline 
interpolation), 823 
Natural frequency, 63 
Natural logarithm, 636-638, 642, 
A63 
Natural spline, 823 
n-dimensional vector spaces, 311 
Negative (scalar multiplication), 260 
Negative definite (quadratic form), 
346 
Neighborhood, 619, 720 
Net flow, through cut set, 994-995 
NETLIB, 789 
Networks: 
defined, 991 
flow problems in, 991-997 
cut sets, 994-996 
flow augmenting paths, 
992-993 
paths, 992 
Neumann, Carl, 198n.7 
Neumann, John von, 959n.1 
Neumann boundary condition, 564 
Neumann problem, 605, 923 
elliptic PDEs, 931 
Laplace’s equation, 593 
two-dimensional heat equation, 
564 
Neumann’s function, 198 
NEWTON, ALGORITHM, 802 
Newton, Sir Isaac, 15n.3 
Newton—Cotes formulas, 833, 843 
Newton’s (Gregory—Newton’s) 
backward difference 
interpolation formula, 818-819 
Newton’s divided difference 
interpolation, 812-815, 842 
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Newton’s divided difference 
interpolation formula, 814-815 
Newton’s (Gregory—Newton’s) 
forward difference 
interpolation formula, 
815-818, 842 
Newton’s law of cooling, 15-16 
Newton’s law of gravitation, 377 
Newton’s (Newton—Raphson) 
method, 801-805, 842 
Newton’s second law, 11, 63, 245, 
544, 576 
Neyman, Jerzy, 1068n.1, 1077n.2 
Nicolson, Phyllis, 938n.5 
Nicomedes, 391n.4 
Nilpotent matrices, 270 
NIST (National Institute of Standards 
and Technology), 789 
Nodal incidence matrix, 262 
Nodal lines, 580-581, 588 
Nodes, 165, 925-926 
degenerate, 145-146 
improper, 142 
interpolation, 808 
proper, 143 
spline interpolation, 820 
trapezoidal rule, 829 
vibrating string, 547 
Nonbasic variables, 960 
Nonconservative physical systems, 
422 
Nonhomogeneous linear ODEs: 
convolution, 235-236 
first-order, 28-29 
higher-order, 106, 116-122 
second-order, 79-102 
defined, 47 
method of undetermined 
coefficients, 81-85 
modeling electric circuits, 
93-99 
modeling forced oscillations, 
85-92 
particular solution, 80 
solution by variation of 
parameters, 99-102 
Nonhomogeneous linear systems, 
138, 160-163, 166, 272, 290, 
291, 845 
method of undetermined 
coefficients, 161 
method of variation of parameters, 
162-163 
Nonhomogeneous PDEs, 541 
Nonlinear ODEs, 46 
first-order, 27 
higher-order homogeneous, 
105 
second-order, 46 
Nonlinear PDEs, 541 


Nonlinear systems, qualitative 
methods for, 152-160 
linearization, 152-155 
Lotka—Volterra population model, 
155-156 
transformation to first-order 
equation in phase plane, 
157-159 
Nonparametric tests (statistics), 
1100-1102, 1113 
Nonsingular matrices, 128, 301 
Norm(s): 
matrix, 861, 866-868 
orthogonal functions, 500 
vector, 312, 355, 410, 866 
Normal accelerations, 391 
Normal acceleration vector, 387 
Normal derivative, 437 
defined, 437 
mixed problems, 768, 931 
Neumann problems, 931 
solutions of Laplace’s equation, 
460 
Normal distributions, 1045-1051, 
1062 
as approximation of binomial 
distribution, 1049-1050 
confidence intervals: 
for means of, 1069-1073 
for variances of, 1073-1076 
distribution function, 1046-1047 
means of: 
confidence intervals for, 
1069-1073 
hypothesis testing for, 
1081-1084 
numeric values, 1047-1048 
tables, A101—A102 
two-dimensional, 1110 
working with normal tables, 
1048-1049 
Normal equations, 873, 1105-1106 
Normal form (linear optimization 
problems), 955-957, 959, 969 
Normalizing, eigenvectors, 326 
Normal matrices, 352, 882 
Normal mode: 
circular membrane, 588 
vibrating string, 547-548 
Normal plane, 390 
Normal random variables, 1045 
Normal vectors, 366, 441 
Not rejecting a hypothesis, 1081 
No trend hypothesis, 1101 
nth order linear ODEs, 105, 123 
nth-order ODEs, 134-135 
nth partial sum, 170 
Fourier series, 495 
of series, 673 
nth roots, 616 


nth roots of unity, 617 
Null hypothesis, 1078 
Nullity, 287, 291 
Null space, 287, 291 
Numbers: 
acceptance, 1092 
Bernoulli’s law of large numbers, 
1051 
chromatic, 1006 
complex, 608-619, 641 
addition of, 609, 610 
conjugate, 612 
defined, 608 
division of, 610 
multiplication of, 609, 610 
polar form of, 613-618 
subtraction of, 610 
condition, 868—870, 899 
Fibonacci, 690 
floating-point form of, 791-792 
machine, 792 
random, 1064 
Number of degrees of freedom, 1071, 
1074 
Numerics, see Numeric analysis 
Numerical Algorithms Group, Inc. 
(NAG), 789 
Numerically stable algorithms, 796, 
842 
Numerical Recipes, 789 
Numeric analysis (numerics), 
787-843 
algorithms, 796 
basic error principle, 796 
error propagation, 795 
errors of numeric results, 794-795 
floating-point form of numbers, 
791-792 
interpolation, 808-820 
equal spacing, 815-819 
Lagrange, 809-812 
Newton’s backward difference 
formula, 818-819 
Newton’s divided difference, 
812-815 
Newton’s forward difference 
formula, 815-818 
spline, 820-827 
loss of significant digits, 793-794 
numeric differentiation, 838-839 
numeric integration, 827-838 
adaptive, 835-836 
Gauss integration formulas, 
836-838 
rectangular rule, 828 
Simpson’s rule, 831-835 
trapezoidal rule, 828-831 
for ODEs, 901-922 
first-order, 901-915 
higher order, 915-922 
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numeric integration (Cont.) 
for PDEs, 922-945 
elliptic, 922-936 
hyperbolic, 942-945 
parabolic, 936-942 
roundoff, 792-793 
software for, 788-789 
solution of equations by iteration, 
798-808 
fixed-point iteration, 798-801 
Newton’s (Newton—Raphson) 
method, 801-805 
secant method, 805-806 
speed of convergence, 804-805 
spline interpolation, 820-827 
Numeric differentiation, 838-839 
Numeric integration, 827-838 
adaptive, 835-836 
Gauss integration formulas, 
836-838 
rectangular rule, 828 
Simpson’s rule, 831-835 
trapezoidal rule, 828-831 
Numeric linear algebra, 844-899 
curve fitting, 872-876 
least squares method, 872-876 
linear systems, 845 
Gauss elimination, 844-852 
Gauss—Jordan elimination, 
856-857 
ill-conditioning norms, 
864-872 
iterative methods, 858-864 
LU-factorization, 852-855 
matrix eigenvalue problems, 
876-896 
inclusion theorems, 879-884 
power method, 885-888 
QR-factorization, 892-896 
tridiagonalization, 888-892 
Numeric methods: 
choice of, 791, 879 
defined, 791 
n Xn matrix, 125 
Nystrom, E. J., 919 


Objective function, 951, 969 

OCs (operating characteristics), 1081 

OC curve, see Operating 
characteristic curve 

Odd functions, 486-488 

Odd periodic extension, 488-490 

ODEs, see Ordinary differential 
equations 

Ohm, Georg Simon, 93n.7 

Ohm’s law, 29 

One-dimensional heat equation, 559 

One-dimensional wave equation, 
544-545 


One-parameter family of curves, 36-37 
One-sided alternative (hypothesis 
testing), 1079-1080 
One-sided tests, 1079 
One-step methods, 908, 911, 947 
One-to-one mapping, 737n.1 
Open annulus, 619 
Open circular disk, 619 
Open integration formula, 838 
Open intervals, 4, A72n.3 
Open Leontief input-output model, 
334 
Open set, in complex plane, 620 
Operating characteristic curve (OC 
curve), 1081, 1092, 1095 
Operating characteristics (OCs), 1081 
Operational calculus, 60, 203 
Operation count (Gauss elimination), 
850 
Operators, 60-61, 313 
Optimal solutions (normal form of 
linear optimization problems), 
957 
Optimization: 
combinatorial, 970, 975—1008 
assignment problems, 
1001-1006 
flow problems in networks, 
991-997 
Ford—Fulkerson algorithm for 
maximum flow, 
998-1001 
shortest path problems, 
975-980 
constrained (linear), 951, 954-968 
normal form of problems, 
955-957 
simplex method, 958-968 
unconstrained: 
basic concepts, 951-952 
method of steepest descent, 
952-954 
Optimization methods, 949 
Optimization problems, 949, 
954-958 
normal form of problems, 
955-957 
objective, 951 
simplex method, 958-968 
degenerate feasible solution, 
962-965 
difficulties in starting, 965-968 
Order: 
and complexity of algorithms, 978 
Gauss elimination, 850 
of iteration process, 804 
of PDE, 540 
singularities, 714 
Ordering (Greedy algorithm), 987 
Order statistics, 1100 
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Ordinary differential equations 
(ODEs), 44 
autonomous, 11, 33 
defined, 1, 3-4 
first-order, 2-45 
direction fields, 9-10 
Euler’s method, 10-11 
exact, 20-27 
geometric meanings of, 9-12 
initial value problem, 38—43 
linear, 27-36 
modeling, 2-8 
numeric analysis, 901-915 
orthogonal trajectories, 36-38 
separable, 12—20 
higher-order linear, 105-123 
homogeneous, 105-116, 123 
nonhomogeneous, 116-123 
systems of, see Systems of 
ODEs 
Laplace transforms, 203-253 
convolution, 232—237 
defined, 204, 205 
of derivatives, 211-212 
differentiation of, 238-240 
Dirac delta function, 226-228 
existence, 209-210 
first shifting theorem 
(s-shifting), 208-209 
general formulas, 248 
initial value problems, 
213-216 
integral equations, 236-237 
of integral of a function, 
212-213 
integration of, 238-240 
linearity of, 206-208 
notation, 205 
ODEs with variable 
coefficients, 240-241 
partial differential equations, 
600-603 
partial fractions, 228-230 
second shifting theorem 
(t-shifting), 219-223 
short impulses, 225-226 
systems of ODEs, 242-247 
table of, 249-251 
uniqueness, 210 
unit step function (Heaviside 
function), 217-219 
linear, 46 
nonlinear, 46 
numeric analysis, 901-922 
first-order ODEs, 901-915 
higher order ODEs, 915-922 
second-order linear, 46-104 
homogeneous, 46-79 
nonhomogeneous, 79-102 
second-order nonlinear, 46 
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Ordinary differential equations (Cont.) 
series solutions of ODEs, 167-202 
Bessel functions, 187—194, 
196-200 
Bessel’s equation, 187—200 
Frobenius method, 180-187 
Legendre polynomials, 
177-179 
Legendre’s equation, 175— 179 
power series method, 167-175 
systems of, 124-166 
basic theory, 137-139 
constant-coefficient, 140-151 
conversion of nth-order ODEs 
to, 134-135 
homogeneous, 138 
Laplace transforms, 242-247 
linear, 124-130, 138-151, 
160-163 
matrices and vectors, 124-130 
as models of applications, 
130-134 
nonhomogeneous, 138, 160-163 
nonlinear, 152-160 
in phase plane, 124, 141-146, 
157-159 
qualitative methods for 
nonlinear systems, 
152-160 
Orientable surfaces, 446-447 
Oriented curve, 644 
Oriented surfaces, integrals over, 
446-447 
Origin (vertex), 980 
Orthogonal, to a vector, 362 
Orthogonal coordinate curves, A74 
Orthogonal expansion, 504 
Orthogonal functions: 
defined, 500 
Sturm-—Liouville Problems, 
500-503 
Orthogonality: 
trigonometric system, 479-480, 538 
vector differential calculus, 
361-363 
Orthogonal matrices, 335, 337-338, 
353, A85n.2 
Orthogonal polynomials, 179 
Orthogonal series (generalized 
Fourier series), 504-510 
completeness, 508-509 
mean square convergence, 
507-508 
Orthogonal trajectories: 
defined, 36 
first-order ODEs, 36—38 
Orthogonal transformations, 336, 
A85n.2 
Orthogonal vectors, 312, 362, 410 
Orthonormal functions, 500, 501, 508 


Orthonormal system, 337 
Oscillations: 
forced, 85-92 
free, 62-70 
harmonic, 63-64 
second-order linear ODEs: 
homogeneous, 62-70 
nonhomogeneous, 85-92 
Osculating plane, 389, 390 
Outcomes: 
of experiments, 1015, 1060 
probability theory, 1015 
Outer normal derivative, 460, 931 
Outliers, 1013-1015 
Output (response to input), 27, 86, 
214 
Overdamping, 65-66 
Overdetermined linear systems, 277 
Overflow (floating-point numbers), 
792 
Overrelaxation factor, 863 


Paired comparison, 1084, 1113 
Pappus, theorem of, 452 
Pappus of Alexandria, 452n.7 
Parabolic PDEs: 
defined, 923 
numeric analysis, 936-942 
Parallelogram law, 357 
Parallel processing of products (on 
computer), 265 
Parameters, 175, 381, 1112 
estimation of, 1063 
point estimation of, 1065-1068 
probability distributions, 
1035 
of a sample, 1065 
Parameter curves, 442 
Parametric representations, 381, 
439-44] 
Parseval, Marc Antoine, 497n.3 
Parseval equality, 509 
Parseval’s identity, 497 
Parseval’s theorem, 497 
Partial derivatives, A69—A71 
defined, A69 
first (first order), A71 
second (second order), A71 
third (third order), A71 
of vector functions, 380 
Partial differential equations (PDEs), 
473, 540-605 
basic concepts of, 540-543 
d’Alembert’s solution, 553-556 
defined, 540 
double Fourier series solution, 
577-585 
heat equation, 557-558 
Dirichlet problem, 564-566 


Partial differential equations (Cont.) 
Laplace’s equation, 564 
solution by Fourier integrals, 

568-571 
solution by Fourier series, 
558-563 
solution by Fourier transforms, 
571-574 
steady two-dimensional heat 
problems, 546-566 
unifying power of methods, 
566 
homogeneous, 541 
Laplace’s equation, 593-600 
boundary value problem in 
spherical coordinates, 
594-596 
in cylindrical coordinates, 
593-594 
Fourier—Legendre series, 
596-598 
in spherical coordinates, 594 
Laplace transforms, solution by, 
600-603 
Laplacian in polar coordinates, 
585-592 
linear, 541 
method of separating variables, 
545-553 
Fourier series, 548-551 
satisfying boundary conditions, 
546-548 
two ODEs from wave 
equation, 545-546 
nonhomogeneous, 541 
nonlinear, 541 
numeric analysis, 922-945 
elliptic, 922-936 
hyperbolic, 942-945 
parabolic, 936-942 
ODEs vs., 4 
wave equation, 544-545 
d’Alembert’s solution, 
553-556 
solution by separating 
variables, 545-553 
two-dimensional, 575-584 

Partial fractions (Laplace transforms), 
228-230 

Partial pivoting, 276, 846-848, 898 

Partial sums, of series, 477, 478, 495 

Particular solution(s): 

first-order ODEs, 6, 44 
higher-order homogeneous linear 
ODEs, 106 
nonhomogeneous linear systems, 
160 
second-order linear ODEs: 
homogeneous, 49-51, 104 
nonhomogeneous, 80 
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Partitioning, of a path, 645 
Pascal, Blaise, 391n.4 
Pascal, Etienne, 391n.4 
Paths: 
alternating, 1002 
augmenting, 1002-1003 
closed, 414, 645, 975-976 
deformation of, 656 
directed, 1000 
flow augmenting, 992-993, 998, 
1008 
flow problems in networks, 992 
integration by use of, 647-650 
longest, 976 
partitioning of, 645 
principle of deformation of, 656 
shortest, 976 
shortest path problems, 975-976 
simple closed, 652 
Path dependence (line integrals), 
418-426, 470, 649-650 
defined, 418 
and integration around closed 
curves, 421-425 
Path independence, 669 
Cauchy’s integral theorem, 655 
in a domain D in space, 419 
proof of, A88—A89 
Stokes’s Theorem applied to, 
468 
Path of integration, 414, 644 
Pauli spin matrices, 351 
p-charts, 1091-1092 
PDEs, see Partial differential 
equations 
Pearson, Egon Sharpe, 1077n.2 
Pearson, Karl, 1077, 1086n.4 
Period, 475 
Periodic boundary conditions, 501 
Periodic extensions, 488-490 
Periodic function, 474-475, 538 
Periodic Sturm—Liouville problem, 
501 
Permutations: 
of n things taken k at a time, 
1025 
of n things taken k at a time with 
repetitions, 1025-1026 
probability theory, 1024-1026 
Perron, Oskar, 882n.8 
Perron—Frobenius Theorem, 883 
Perron’s theorem, 334, 882-883 
Pfaff, Johann Friedrich, 422n.1 
Pfaffian form, 422 
p-fold connected domains, 652-653 
Phase angle, 90 
Phase lag, 90 
Phase plane, 134, 165 
linear systems, 141, 148 
nonlinear systems, 152 


Phase plane method, 124 
linear systems: 
critical points, 142-146 
graphing solutions, 141-142 
nonlinear systems, 152 
linearization, 152-155 
Lotka—Volterra population 
model, 155-156 
transformation to first-order 
equation in, 157-159 
Phase plane representations, 134 
Phase portrait, 165 
linear systems, 141-142, 148 
nonlinear systems, 152 
Picard, Emile, 42n.10 
Picard’s Iteration Method, 42 
Picard’s theorem, 716 
Piecewise continuous functions, 209 
Piecewise smooth path of integration, 
414, 645 
Piecewise smooth surfaces, 442, 447 
Pivot, 276, 898, 960 
Pivot equation, 276, 846, 898, 960 
Planar graphs, 1005 
Plane: 
complex, 611 
extended, 718, 744-745 
finite, 718 
sets in, 620 
normal, 390 
osculating, 389, 390 
phase, 134, 165 
linear systems, 141, 148 
nonlinear systems, 152 
rectifying, 390 
tangent, 398, 441-442 
vectors in, 309 
Plane curves, 383 
Planimeters, 436 
Poincaré, Henri, 141n.1, 510n.8 
Points: 
boundary, 426n.2, 620 
branch, 755 
center, 144, 165 
critical, 33, 144, 165 
asymptotically stable, 149 
and conformal mapping, 738, 
757 
constant-coefficient systems of 
ODEs, 142-151 
isolated, 152 
nonlinear systems, 152 
stable, 140, 149 
stable and attractive, 140, 149 
unstable, 140, 149 
equilibrium, 33-34 
fixed, 745, 799 
guidepoints, 827 
at infinity, 718 
initial (vectors), 355 
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Points: (Cont.) 
lattice, 925-926 
limit, A93 
mesh, 925-926 
regular, 181 
regular singular, 180n.4 
saddle, 143, 165 
sample, 1015 
singular, 181, 201 
analytic functions, 693 
regular, 180n.4 
spiral, 144-145, 165 
stagnation, 773 
stationary, 952 
terminal (vectors), 355 
Point estimation of parameters 
(statistics), 1065-1068, 1113 
defined, 1065 
maximum likelihood method, 
1066-1067 
Point set, in complex plane, 620 
Point source (flow modeling), 776 
Point spectrum, 525 
Poisson, Siméon Denis, 779n.2 
Poisson distributions, 1041-1042, 
1061, A100 
Poisson equation: 
defined, 923 
numeric analysis, 922-936 
ADI method, 928-930 
difference equations, 923-925 
Dirichlet problem, 925-928 
mixed boundary value 
problem, 931-933 
Poisson’s integral formula: 
derivation of, 778-778 
potential theory, 777-781 
series for potentials in disks, 
779-780 
Polar coordinates, 431 
Laplacian in, 585-592 
notation for, 594 
two-dimensional wave equation 
in, 586 
Polar form, of complex numbers, 
613-618, 631 
Polar moment of inertia, of a region, 
429 
Poles (singularities), 714-715 
of order m, 735 
and zeros, 717 
Polynomials, 624 
characteristic, 325, 353, 877 
Chebyshev, 504 
interpolation, 808, 842 
Laguerre, 241, 504 
Legendre, 167, 177-179, 202 
orthogonal, 179 
trigonometric: 
approximation by, 495-498 
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Polynomials (Cont.) 
complex, 529 
of the same degree N, 495 
Polynomial approximations, 808 
Polynomial interpolation, 808, 842 
Polynomially bounded, 979 
Polynomial matrix, 334, 878-879 
Populations: 
infinite, 1044 
for statistical sampling, 1063 
Population dynamics: 
defined, 33 
logistic equation, 33-34 
Position vector, 356 
Positive correlation, 1111 
Positive definite (quadratic form), 
346 
Positive sense, on curve, 644 
Possible values (random variables), 
1030 
Postman problem, 980 
Potential (potential function), 400 
complex, 760-761 
Laplace’s equation, 593 
Poisson’s integral formula for, 
7771-781 
Potential theory, 179, 420, 460, 
758-786 
conformal mapping for boundary 
value problems, 763-767 
defined, 758 
electrostatic fields, 759-763 
complex potential, 760-761 
superposition, 761-762 
fluid flow, 771-777 
harmonic functions, 781-784 
heat problems, 767-770 
Laplace’s equation, 593, 628 
Poisson’s integral formula, 777-781 
Power function, of a test, 1081, 1113 
Power method (matrix eigenvalue 
problems), 885-888, 899 
Power series, 168, 671-707 
convergence behavior of, 680-682 
convergence tests, 674-676, 
A93-A94 
functions given by, 685-690 
Maclaurin series, 690 
in powers of x, 168 
radius of convergence, 682-684 
ratio test, 676-678 
root test, 678-679 
sequences, 671-673 
series, 673-674 
Taylor series, 690-697 
uniform convergence, 698-705 
and absolute convergence, 704 
properties of, 700-701 
termwise integration, 701-703 
test for, 703-704 


Power series method, 167-175, 201 
extension of, see Frobenius method 
idea and technique of, 168-170 
operations on, 173-174 
theory of, 170-174 

Practical resonance, 90 

Predator—prey population model, 

155-156 

Predictor—corrector method, 913 

PRIM, ALGORITHM, 989 

Prim, Robert Clay, 988n.6 

Prim’s algorithm, 988-991, 1008 

Principal axes, transformation to, 344 

Principal branch, of logarithm, 639 

Principal directions, 330 

Principal minors, 346 

Principal part, 735 
of isolated singularities, 715 
of singularities, 708, 709 

Principal value (complex numbers), 

614, 617, 642 
complex logarithm, 637 
general powers, 639 

Principle of deformation of path, 656 

Prior estimates, 805 

Probability, 1060 
axioms of, 1020 
basic theorems of, 1020-1022 
conditional, 1022-1023 
definitions of, 1018-1020 
independent events, 1023 

Probability distributions, 1029, 1061 
binomial, 1039-1042 
continuous, 1032-1034 
discrete, 1030-1032 
hypergeometric, 1042-1044 
mean and variance of, 1035-1039 
multinomial, 1045 
normal, 1045-1051 
Poisson, 1041-1042 
of several random variables, 

1051-1060 
addition of means, 1057-1058 
addition of variances, 
1058-1059 
continuous two-dimensional 
distributions, 1053 
discrete two-dimensional 
distributions, 1052-1053 
function of random variables, 
1056 
independence of random 
variables, 1055-1056 
marginal distributions, 
1053-1055 
symmetric, 1036 
two-dimensional, 1051 
continuous, 1053 
discrete, 1052-1053 
uniform, 1035-1036 


Probability function, 1030-1032, 
1052, 1061 
Probability theory, 1009, 1015-1062 
binomial coefficients, 1027-1028 
combinations, 1024, 1026-1027 
distributions (probability 
distributions), 1029 
binomial, 1039-1042 
continuous, 1032-1034 
discrete, 1030-1032 
hypergeometric, 1042-1044 
mean and variance of, 
1035-1039 
normal, 1045-1051 
Poisson, 1041-1042 
of several random variables, 
1051-1060 
events, 1016-1017 
experiments, 1015-1016 
factorial function, 1027 
outcomes, 1015 
permutations, 1024-1026 
probability: 
basic theorems of, 1020-1022 
conditional, 1022-1023 
definition of, 1018-1020 
independent events, 1023 
random variables, 1029-1030 
continuous, 1032-1034 
discrete, 1030-1032 
Problem of existence, 39 
Problem of uniqueness, 39 
Producers, 1092 
Producer’s risk, 1094 
Product: 
inner (dot), 312 
for complex vectors, 349 
invariance of, 336 
vector differential calculus, 
361-367, 410 
of matrix, 260 
determinants of, 307-308 
inverting, 306 
matrix multiplication, 263, 320 
parallel processing of (on 
computer), 265 
scalar multiplication, 260 
scalar triple, 373-374, 411 
vector (cross): 
in Cartesian coordinates, 
A83-A84 
vector differential calculus, 
368-375, 410 
Product method, 605. See also 
Method of separating variables 
Projection (vectors), 365 
Proper node, 143 
Pseudocode, 796 
Pure imaginary complex numbers, 
609 
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QR-factorization, 892-896 
Quadrant, of a circle, 604 
Quadratic forms (matrix eigenvalue 
problems), 343-344 
Quadratic interpolation, 
810-811 
Qualitative methods, 124, 
141n.1 
defined, 152 
for nonlinear systems, 152-160 
linearization, 152—155 
Lotka—Volterra population 
model, 155-156 
transformation to first-order 
equation in phase plane, 
157-159 
Quality control (statistics), 
1087-1092, 1113 
for mean, 1088-1089 
for range, 1090-1091 
for standard deviation, 1090 
for variance, 1089-1090 
Quantitative methods, 124 
Quasilinear equations, 555, 923 
Quotient: 
complex numbers, 610 
difference, 923 
Rayleigh, 885, 899 


Radius: 
of convergence, 172 
defined, 172 
power series, 682-684, 706 
of a graph, 991 
Random experiments, 1011, 
1015-1016, 1060 
Randomly selected samples, 1064 
Randomness, 1015, 1064. See also 
Random variables 
Random numbers, 1064 
Random number generators, 1064 
Random sampling (statistics), 
1063-1065 
Random selections, 1064 
Random variables, 1011, 1029-1030, 
1061 
continuous, 1029, 1032-1034, 
1055 
defined, 1030 
dependent, 1055 
discrete, 1029-1032, 1054 
function of, 1056 
independence of, 1055-1056 
marginal distribution of, 1054, 
1055 
normal, 1045 
occurrence of, 1063 
probability distributions of, 
1051-1060 


addition of means, 1057-1058 
addition of variances, 1058-1059 
continuous two-dimensional 

distributions, 1053 
discrete two-dimensional 
distributions, 1052-1053 
function of random variables, 
1056 
independence of random 
variables, 1055-1056 
marginal distributions, 
1053-1055 
skewness of, 1039 
standardized, 1037 
two-dimensional, 1051, 1062 
Random variation, 1063 
Range, 1013 
control chart for, 1090-1091 
defined, 1090 
of f, 620 
Rank: 
of A, 279 
of a matrix, 279, 283, 321 
in terms of column vectors, 
284-285 
in terms of determinants, 297 
of R, 279 

Raphson, Joseph, 801n.1 

Rational functions, 624, 725-729 

Ratio test (power series), 676-678 

Rayleigh, Lord (John William Strutt), 
160n.5, 885n.10 

Rayleigh equation, 160 

Rayleigh quotient, 885, 899 

Reactance (RLC circuits), 94 

Real axis (complex plane), 611 

Real different roots, 71 

Real double root, 55—56, 72 

Real functions, complex analytic 
functions vs., 694 

Real inner product space, 312 

Real integrals, residue integration of, 
725-733 

Fourier integrals, 729-730 

improper integrals, 730-732 

of rational functions of cos 0 
sin 0, 725-729 

Real part (complex numbers), 609 

Real pre-Hilbert space, 312 

Real roots: 

different, 71 
double, 55-56 
higher-order homogeneous linear 
ODEs: 
distinct, 112-113 
multiple, 114-115 
second-order homogeneous linear 
ODEs: 
distinct, 54—55 
double, 55-56 
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Real sequence, 671 
Real series, A73—A74 
Real vector spaces, 309-311, 359, 


410 
Recording, of sample values, 
1011-1012 


Rectangular cross-section, 120 
Rectangular matrix, 258 
Rectangular membrane R, 577-584 
Rectangular rule (numeric 
integration), 828 
Rectifiable (curves), 385 
Rectification (acceptance sampling), 
1094-1095 
Rectifying plane, 390 
Recurrence formula, 201 
Recurrence relation, 176 
Recursion formula, 176 
Reduced echelon form, 279 
Reduction of order (second-order 
homogeneous linear ODEs), 
51-52 
Regions, 426n.2 
bounded, 426n.2 
center of gravity of mass in, 429 
closed, 426n.2 
critical, 1079 
feasibility, 954 
fundamental (exponential 
function), 632 
moments of inertia of, 429 
polar moment of inertia of, 429 
rejection, 1079 
sets in complex plane, 620 
total mass of, 429 
volume of, 428 
Regression analysis, 1063, 
1103-1108, 1113 
confidence intervals in, 
1107-1108 
defined, 1103 
Regression coefficient, 1105, 
1107-1108 
Regression curve, 1103 
Regression line, 1103, 1104, 1106 
Regular point, 181 
Regular singular point, 180n.4 
Regular Sturm—Liouville problem, 
501 
Rejectable quality level (RQL), 1094 
Rejection: 
of a hypothesis, 1078 
of products, 1092 
Rejection region, 1079 
Relative class frequency, 1012 
Relative error, 794 
Relative frequency (probability): 
of an event, 1019 
class, 1012 
cumulative, 1012 
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Relaxation methods, 862 
Remainder, 170 
of a series, 673 
of Taylor series, 691 
Remarkable parallelogram, 375 
Removable singularities, 717 
Repeated factors, 220, 221 
Representation, 315 
by Fourier series, 476 
by power series, 683 
spectral, 525 
Residual, 805, 862, 899 
Residues, 708, 720, 735 
at mth-order pole, 722 
at simple poles, 721-722 
Residue integration, 719-733 
formulas for residues, 721-722 
of real integrals, 725-733 
Fourier integrals, 729-730 
improper integrals, 730-732 
of rational functions of cos 0 
sin 6, 725-729 
several singularities inside 
contour, 723-725 
Residue theorem, 723-724 
Resistance, apparent, 95 
Resonance: 
practical, 90 
undamped forced oscillations, 
88-89 
Resonance factor, 88 
Response to input, see Output 
(response to input) 
Resultant, of forces, 357 
Riccati equation, 35 
Riemann, Bernhard, 625n.4 
Riemannian geometry, 625n.4 
Riemann sphere, 718 
Riemann surfaces (conformal 
mapping), 754-757 
Right-hand derivatives (Fourier 
series), 480 
Right-handed Cartesian coordinate 
system, 368-369, A83—A84 
Right-handed triple, 369 
Right-hand limit (Fourier series), 480 
Right-sided tests, 1079, 1082 
Risks of making false decisions, 1080 
RKF method, see 
Runge—Kutta—Fehlberg method 
RK methods, see Runge-Kutta 
methods 
RKN methods, see 
Runge—Kutta—Nystr6m methods 
Robin problem: 
Laplace’s equation, 593 
two-dimensional heat equation, 564 
Rodrigues, Olinde, 179n.2 
Rodrigues’s formula, 179, 241 
Romberg integration, 840, 843 


Roots: 
complex: 
higher-order homogeneous 
linear ODEs, 113-115 
second-order homogeneous 
linear ODEs, 57-59 
complex conjugate, 72-73 
differing by an integer, 183 
Frobenius method, 183 
distinct (Frobenius method), 182 
double (Frobenius method), 183 
of equations, 798 
multiple complex, 115 
nth, 616 
nth roots of unity, 617 
simple complex, 113-114 
Root test (power series), 678-679 
Rotation (vorticity of flow), 774 
Rounding, 792 
Rounding unit, 793 
Roundoff (numeric analysis), 792-793 
Roundoff errors, 792, 794, 902 
Roundoff rule, 793 
Rows: 
determinants, 294 
matrix, 125, 257, 320 
Row echelon form, 279-280 
Row-equivalent matrices, 283-284 
Row-equivalent systems, 277 
Row operations (linear systems), 276, 
277 
Row scaling (Gauss elimination), 850 
Row “sum” norm, 861 
Row vectors, 126, 257, 320 
RQL (rejectable quality level), 1094 
Runge, Carl, 820n.3 
Runge, Karl, 905n.1 
RUNGE-KUTTA, ALGORITHM, 905 
Runge—Kutta—Fehlberg (RKF) 
method, 947 
error of, 908 
first-order ODEs, 906-908 
Runge-Kutta (RK) methods, 915, 947 
error of, 908 
first-order ODEs, 904—906 
higher order ODEs, 917-919 
Runge—Kutta—Nystr6m (RKN) 
methods, 919-921, 947 
Rutherford, E., 1044, 1100 
Rutherford—Geiger experiments, 
1044, 1100 
Rutishauser, Heinz, 892n.12 


Saddle point, 143, 165 
Samples: 
for experiments, 1015 
in mathematical statistics, 
1063-1064 
selection of, 1063-1064 


Sample covariance, 1105 
Sampled function, 529 
Sample distribution function, 1096 
Sample mean, 1064, 1113 
Sample points, 1015 
Sample regression line, 1104 
Sample size, 1015, 1064 
Sample space, 1015, 1016, 1060 
Sample standard deviation, 1065 
Sample variance, 1015, 1113 
Sampling: 
from a population, 1023 
random, 1063-1065 
with replacement, 1023 
binomial distribution, 1042 
hypergeometric distribution, 
1043-1044 
in statistics, 1063 
without replacement, 1018, 1023 
binomial distribution, 
1042-1043 
hypergeometric distribution, 
1043-1044 
Sampling plan, 1092-1093 
Scalar(s), 260, 310, 354 
Scalar fields, vector fields that are 
gradients of, 400-401 
Scalar functions: 
defined, 376 
vector differential calculus, 376 
Scalar matrices, 268 
Scalar multiplication, 126-127, 310 
of matrices and vectors, 259-261 
vectors in 2-space and 3-space, 
358-359 
Scalar triple product, 373-374, 411 
Scale (vectors), 886-887 
Scanning labeled vertices, 998 
Schrédinger, Erwin, 226n.2 
Schur, Issai, 882n.7 
Schur’s inequality, 882 
Schur’s theorem, 882 
Schwartz, Laurent, 226n.2 
Secant, formula for, A65 
Secant method (numeric analysis), 
805-806, 842 
Second boundary value problem, see 
Neumann problem 
Second-order determinants, 291-292 
Second-order differential operator, 60 
Second-order linear ODEs, 46-104 
homogeneous, 46-79 
basis, 50-52 
with constant coefficients, 
53-60 
differential operators, 60-62 
Euler—Cauchy equations, 
71-74 
existence and uniqueness of 
solutions, 74-79 
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Second-order linear ODEs (Cont.) 
general solution, 49-51, 77-78 
initial value problem, 49-50 
modeling free oscillations of 

mass-—spring system, 
62-70 
reduction of order, 51—52 
superposition principle, 47-48 
Wronskian, 75—78 
nonhomogeneous, 79-102 
defined, 47 
general solution, 80-81 
method of undetermined 
coefficients, 81-85 
modeling electric circuits, 93-99 
modeling forced oscillations, 
85-92 
solution by variation of 
parameters, 99-102 

Second-order method, improved 
Euler method as, 904 

Second-order nonlinear ODEs, 46 

Second-order PDEs, 540-541 

Second (second order) partial 
derivatives, A71 

Second shifting theorem (f-shifting), 
219-223 

Second transmission line equation, 
599 

Seidel, Philipp Ludwig von, 858n.4 

Self-starting methods, 911 

Sense reversal (complex line 
integrals), 645 

Separable equations, 12-13 

Separable ODEs, 44 

first-order, 12—20 
extended method, 17-18 
modeling, 13-17 
reduction of nonseparable ODEs 
to, 17-18 
Separating variables, method of, 
12-13 
circular membrane, 587 
partial differential equations, 
545-553, 605 
Fourier series, 548-551 
satisfying boundary conditions, 
546-548 
two ODEs from wave 
equation, 545-546 
vibrating string, 545-546 
Separation constant, 546 
Sequences (infinite sequences): 
bounded, A93—A95 
convergent, 507-508, 672 
divergent, 672 
limit point of, A93 
monotone real, A72—A73 
power series, 671-673 
real, 671 


Series, A73—A74 
binomial, 696 
conditionally convergent, 675 
convergent, 171, 673 
cosine, 781 
derived, 687 
divergent, 171, 673 
double Fourier: 
defined, 582 
rectangular membrane, 
577-585 
Fourier, 473-483, 538 
convergence and sum or, 
480-481 
derivation of Euler formulas, 
479-480 
double, 577-585 
even and odd functions, 
486-488 
half-range expansions, 488-490 
heat equation, 558-563 
from period 277 to 2L, 
483-486 
Fourier—Bessel, 506—507, 589 
Fourier cosine, 484, 486, 538 
Fourier—Legendre, 505-506, 
596-598 
Fourier sine, 477, 486, 538 
one-dimensional heat equation, 
561 
vibrating string, 548 
geometric, 168, 675 
Taylor series, 694 
uniformly convergent, 698 
hypergeometric, 186 
infinite, 673-674 
Laurent, 708-719, 734 
analytic or singular at infinity, 
718-719 
point at infinity, 718 
Riemann sphere, 718 
singularities, 715-717 
zeros of analytic functions, 717 
Maclaurin, 690, 694-696 
orthogonal, 504-510 
completeness, 508-509 
mean square convergence, 
507-508 
power, 168, 671-707 
convergence behavior of, 
680-682 
convergence tests, 674-676, 
A93-A94 
functions given by, 685-690 
Maclaurin series, 690 
in powers of x, 168 
radius of convergence, 
682-684 
ratio test, 676-678 
root test, 678-679 
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Series (Cont. ) 
sequences, 671-673 
series, 673-674 
Taylor series, 690-697 
uniform convergence, 698-705 
real, A73—A74 
Taylor, 690-697, 707 
trigonometric, 476, 484 
value (sum) of, 171, 673 
Series solutions of ODEs, 167-202 
Bessel functions, 187-188 
of the first kind, 189-194 
of the second kind, 196-200 
Bessel’s equation, 187-196 
Bessel functions, 187-188, 
196-200 
general solution, 194-200 
Frobenius method, 180-187 
indicial equation, 181-183 
typical applications, 183-185 
Legendre polynomials, 177-179 
Legendre’s equation, 175— 179 
power series method, 167-175 
idea and technique of, 
168-170 
operations on, 173-174 
theory of, 170-174 
Sets: 
complete orthonormal, 508 
in the complex plane, 620 
cut, 994-996, 1008 
linearly dependent, 129, 311 
linearly independent, 128-129, 
311 
Shewhart, W. A., 1088 
Shifted function, 219 
Shortest path, 976 
Shortest path problems 
(combinatorial optimization), 
975-980, 1008 
Bellman’s principle, 980-981 
complexity of algorithms, 
978-980 
Dijkstra’s algorithm, 981-983 
Moore’s BFS algorithm, 977—980 
Shortest spanning trees: 
combinatorial optimization, 1008 
Greedy algorithm, 984-988 
Prim’s algorithm, 988-991 
defined, 984 
Short impulses (Laplace transforms), 
225-226 
Sifting property, 226 
Significance (in statistics), 1078 
Significance level, 1078, 1080, 1113 
Significance tests, 1078 
Significant digits, 791-792 
Similarity transformation, 340 
Similar matrices, 340-341, 878 
Simple closed curves, 646 
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Simple closed path, 652 
Simple complex roots, 113-114 
Simple curves, 383 
Simple events, 1015 
Simple general properties of the line 
integral, 415-416 
Simple poles, 714 
Simplex method, 958-968 
degenerate feasible solution, 
962-965 
difficulties in starting, 965-968 
Simplex table, 960 
Simplex tableau, 960 
Simple zero, 717 
Simply connected domains, 423, 646, 
652, 653 
SIMPSON, ALGORITHM, 832 
Simpson, Thomas, 832n.4 
Simpson’s rule, 832, 843 
adaptive integration with, 835-836 
numeric integration, 831-835 
Simultaneous corrections, 862 
Sine function: 
conformal mapping by, 750-751 
formula for, A63—A65 
Sine integral, 514, 697, A68—-A69, A98 
Single precision, floating-point 
standard for, 792 
Singularities (singular, having a 
singularity), 693, 707, 715 
analytic functions, 693 
essential, 715-716 
inside a contour, 723-725 
isolated, 715 
isolated essential, 715 
Laurent series, 715-719 
principal part of, 708 
removable, 717 
Singular matrices, 301 
Singular point, 181, 201 
analytic functions, 693 
regular, 180n.4 
Singular solutions: 
first-order ODEs, 8, 35 
higher-order homogeneous linear 
ODEs, 110 
second-order homogeneous linear 
ODEs, 50, 78 
Singular Sturm—Liouville problem, 
501, 503 
Sink(s): 
motion of a fluid, 404, 458, 775, 
716 
networks, 991 
Size: 
of matrices, 258 
sample, 1015, 1064 
Skew-Hermitian form, 351 
Skew-Hermitian matrices, 347, 348, 
350, 353 


Skewness, of a random variables, 1039 
Skew-symmetric matrices, 268, 320, 
334-336, 353 
Slack variables, 956, 969 
Slope field (direction field), 9-10 
Smooth curves, 414, 644 
Smooth surfaces, 442 
Sobolev, Sergei L’ Vovich, 226n.2 
Software: 
for data representation in statistics, 
1011 
numeric analysis, 788-789 
variable step size selection in, 902 
Solenoid, 405 
Solutions. See also specific methods 
defined, 4, 798 
first-order ODEs: 
concept of, 4-6 
equilibrium solutions, 33-34 
explicit solutions, 21 
family of solutions, 5 
general solution, 6, 44 
implicit solutions, 21 
particular solution, 6, 44 
singular solution, 8, 35 
solution by calculus, 5 
trivial solution, 28, 35 
graphing in phase plane, 141-142 
higher-order homogeneous linear 
ODEs, 106 
general solution, 106, 110-111 
particular solution, 106 
singular solution, 110 
linear systems, 273, 745 
nonhomogeneous linear systems: 
general solution, 160 
particular solution, 160 
PDEs, 541 
second-order homogeneous linear 
ODEs: 
general solution, 49-51, 77-78 
linear dependence and 
independence of, 75 
particular solution, 49-51 
singular solution, 50, 78 
second-order linear ODEs, 47 
second-order nonhomogeneous 
linear ODEs: 
general solution, 80-81 
particular solution, 80 
systems of ODEs, 137, 139 
Solution curves, 4-6 
Solution space, 290 
Solution vector, 273, 745 
SOR (successive overrelaxation), 863 
SOR formula for Gauss-Seidel, 863 
Sorting, of sample values, 1011-1012 
Source(s): 
motion of a fluid, 404, 458, 775 
networks, 991 


Source intensity, 458 
Source line (flow modeling), 776 
Span, of vectors, 286 
Spanning trees, 984, 988 
Sparse graphs, 974 
Sparse matrices, 823, 925 
Sparse systems, 858 
Special functions, 167, 202 
formulas for, A63—A69 
theory of, 175 
Special vector spaces, 285-287 
Specific circulation, of flow, 467 
Spectral density, 525 
Spectral mapping theorem, 878 
Spectral radius, 324, 861 
Spectral representation, 525 
Spectral shift, 896 
Spectrum, 877 
of matrix, 324 
vibrating string, 547 
Speed, 386, 391 
angular (rotation), 372 
of convergence, 804-805 
Spherical coordinates, A74—A76 
boundary value problem in, 
594-596 
defined, 594 
Laplacian in, 594 
Spiral point, 144-145, 165 
Spline, 821, 843 
Spline interpolation, 820-827 
Spring constant, 62 
Square error, 496-497, 539 
Square matrices, 126, 257, 258, 


301-309, 320 
s-shifting, 208-209 
Stability: 


of critical points, 165 
of solutions, 33-34, 124, 936 
of systems, 84, 124 
Stability chart, 149 
Stable algorithms, 796, 842 
Stable and attractive critical points, 
140, 149 
Stable critical points, 140, 149 
Stable equilibrium solution, 33-34 
Stable systems, 84 
Stagnation points, 773 
Standard basis, 314, 359, 365 
Standard deviation, 1014, 1035, 1090 
Standard form: 
first-order ODEs, 27 
higher-order homogeneous linear 
ODEs, 105 
higher-order linear ODEs, 123 
power series method, 172 
second-order linear ODEs, 46, 
103 
Standardized normal distribution, 
1046 
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Standardized random variables, 1037 
Standard trick (confidence intervals), 
1068 
Stationary point (unconstrained 
optimization), 952 
Statistics, 1015, 1063. See also 
Mathematical statistics 
Statistical inference, 1059, 1063 
Steady flow, 405, 458 
Steady heat flow, 767 
Steady-state case (heat problems), 
591 
Steady-state current, 98 
Steady-state heat flow, 460 
Steady-state solution, 31, 84, 89-91 
Steady two-dimensional heat 
problems, 546-566, 605 
Steepest descent, method of, 952-954 
Steiner, Jacob, 451n.6 
Stem-and-leaf plots, 1012 
Stencil (pattern, molecule, star), 925 
Step-by-step methods, 901 
Step function, 828, 1031 
Step size, 901, 902 
Stereographic projection, 718 
Stiff ODEs, 909-910 
Stiff systems, 920-921 
Stirling, James, 1027n.2 
Stirling formula, 1027, A67 
Stochastic matrices, 270 
Stochastic variables, 1029. See also 
Random variables 
Stokes, Sir George Gabriel, 464n.9, 
703n.5 
Stokes’s Theorem, 463-470 
Stream function, 771 
Streamline, 771 
Strength (flow modeling), 776 
Strictly diagonally dominant matrices, 
881 
Sturm, Jacques Charles Francois, 
499n.4 
Sturm-—Liouville equation, 499 
Sturm-—Liouville expansions, 474 
Sturm-—Liouville Problems, 498-504 
eigenvalues, eigenfunctions, 
499-500 
orthogonal functions, 500-503 
Subgraphs, 972 
Submarine cable equations, 599 
Submatrices, 288 
Subsidiary equation, 203, 253 
Subspace, of vector space, 286 
Subtraction: 
of complex numbers, 610 
termwise, of power series, 687 
Success corrections, 862 
Successive overrelaxation (SOR), 863 
Sufficient convergence condition, 
861 


Sum: 
of matrices, 320 
partial, of series, 477, 478, 495 
of a series, 171, 673 
of vectors, 357 
Sum Rule (method of undetermined 
coefficients): 
higher-order homogeneous linear 
ODEs, 115 
second-order nonhomogeneous 
linear ODEs, 81, 83-84 
Superlinear convergence, 806 
Superposition (electrostatic fields), 
761-762 
Superposition (linearity) principle: 
higher-order homogeneous linear 
ODEs, 106 
higher-order linear ODEs, 123 
homogeneous linear systems, 
138 
PDEs, 541-542 
second-order homogeneous linear 
ODEs, 47-48, 104 
undamped forced oscillations, 87 
Surfaces, for surface integrals, 
439-443 
orientation of, 446-447 
representation of surfaces, 
439-441 
tangent plane and surface normal, 
441-442 
Surface integrals, 470 
defined, 443 
surfaces for, 439-443 
orientation of, 446-447 
representation of surfaces, 
439-441 
tangent plane and surface 
normal, 441-442 
vector integral calculus, 443-452 
orientation of surfaces, 
446-447 
without regard to orientation, 
448-450 
Surface normal, 398-399, 442 
Surface normal vector, 398-399 
Surjective mapping, 737n.1 
Sustainable yield, 36 
Symbol O, 979 
Symmetric coefficient matrix, 343 
Symmetric distributions, 1036 
Symmetric matrices, 267-268, 320, 
334-336, 353 
Systems of ODEs, 124-166 
basic theory of, 137-139 
constant-coefficient, 140-151 
critical points, 142-146, 
148-151 
graphing solutions in phase 
plane, 141-142 
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Systems of ODEs (Cont.) 
conversion of nth-order ODEs to, 
134-135 
homogeneous, 138 
Laplace transforms, 242-247 
linear, 138-139. See also Linear 
systems 
constant-coefficient systems, 
140-151 
matrices and vectors, 124-130 
nonhomogeneous, 160-163 
matrices and vectors, 124-130 
calculations with, 125-127 
definitions and terms, 
125-126, 128-129 
eigenvalues and eigenvectors, 
129-130 
systems of ODEs as vector 
equations, 127-128 
as models of applications: 
electrical network, 132-134 
mixing problem involving two 
tanks, 130-132 
nonhomogeneous, 138, 160-163 
method of undetermined 
coefficients, 161 
method of variation of 
parameters, 162-163 
nonlinear systems: 
qualitative methods for, 
152-160 
transformation to first-order 
equation in phase plane, 
157-159 
in phase plane, 124 
critical points, 142-146 
graphing solutions in, 141-142 
transformation to first-order 
equation in, 157-159 
qualitative methods for nonlinear 
systems, 152-160 
linearization, 152-155 
Lotka—Volterra population 
model, 155-156 


Tangent: 

to a curve, 384 

formula for, A65 
Tangent function, conformal mapping 

by, 752-753 

Tangential accelerations, 391 
Tangential acceleration vector, 387 
Tangent plane, 398, 441-442 
Tangent vector, 384, 411 
Target (networks), 991 
Taylor, Brook, 690n.2 
Taylor series, 690-697, 707 
Taylor’s formula, 691 
Taylor’s theorem, 691 
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t-distribution, 1071-1073, 1078, 
A103 
Telegraph equations, 599 
Term(s): 
of a sequence, 671 
of a series, 673 
Terminal point (vectors), 355 
Termination criterion, 802-803 
Termwise addition, 173, 687 
Termwise differentiation, 173, 
687-688, 703 
Termwise integration, 687, 688, 
701-703 
Termwise multiplication, 173, 687 
Termwise subtraction, 687 
Tests, statistical, 1077, 1113 
Theory of special functions, 175 
Thermal diffusivity, 460 
Third boundary value problem, see 
Robin problem 
Third-order determinants, 292-293 
Third (third order) partial derivatives, 
A7l 
3-space, vectors in, 309, 354 
components of a vector, 356-357 
scalar multiplication, 358-359 
vector addition, 357—359 
Three-sigma limits, 1047 
Time (curves in mechanics), 386 
TI-Nspire, 789 
Todd, John, 855n.3 
Tolerance (adaptive integration), 835 
Torricelli, Evangelista, 16n.4 
Torricelli’s law, 16-17 
Torsion, curvature and, 389-390 
Total differential, 20, 45 
Total energy, of physical system, 525 
Total error, 902 
Total mass, of a region, 429 
Total orthonormal set, 508 
Total pivoting, 846 
Trace, 345 
Trail (shortest path problems), 975 
closed trails, 975-976 
Euler trail, 980 
Trajectories, 134, 165 
linear systems, 141-142, 148 
nonlinear systems, 152 
Transcendental equations, 798 
Transducers, 98 
Transfer function, 214 
Transformation(s), 313 
orthogonal, 336 
to principal axes, 344 
Transient solution, 84, 89 
Transient-state solution, 31 
Translation (vectors), 355 
Transposition(s): 
of matrices or vectors, 128, 320 
in samples, 1101 


Trapezoidal rule, 828, 843 
error bounds and estimate for, 
829-831 
numeric integration, 828-831 
Trees (graphs), 984, 988. See also 
Shortest spanning trees 
Trials (experiments), 1011, 1015 
Triangle inequality, 363, 614-615 
Triangular form (Gauss elimination), 
846 
Triangular matrices, 268 
Tricomi, Francesco, 556n.2 
Tricomi equation, 555, 556 
Tridiagonalization (matrix eigenvalue 
problems), 888-892 
Tridiagonal matrices, 823, 888, 928 
Trigonometric analytic functions 
(conformal mapping), 750-754 
Trigonometric function, 633-635, 
642 
inverse, 640 
Taylor series, 695 
Trigonometric polynomials: 
approximation by, 495-498 
complex, 529 
of the same degree N, 495 
Trigonometric series, 476, 484 
Trigonometric system, 475, 479-480, 
538 
Trihedron, 390 
Triple integrals, 470 
defined, 452 
mean value theorem for, 456-457 
vector integral calculus, 452-458 
Triply connected domains, 653, 658, 
659 
Trivial solution, 28, 35 
homogeneous linear systems, 
290 
linear systems, 273 
Sturm-Liouville problem, 499 
Truncating, 794 
t-shifting, 219-223 
Tuning (vibrating string), 548 
Twisted curves, 383 
2-space (plane), vectors in, 354 
components of a vector, 356-357 
scalar multiplication, 358-359 
vector addition, 357-359 
2 X 2 matrix, 125 
Two-dimensional heat equation, 
564-566 
Two-dimensional normal distribution, 
1110 
Two-dimensional probability 
distributions: 
continuous, 1053 
discrete, 1052-1053 
Two-dimensional problems (potential 
theory), 759, 771 


Two-dimensional random variables, 
1051, 1062 

Two-dimensional wave equation, 
575-584, 586 

Two-sided alternative (hypothesis 
testing), 1079-1080 

Two-sided tests, 1079, 1082-1083 

Type I errors, 1080, 1081 

Type II errors, 1080-1081 


UCL (upper control limit), 1088 
Unacceptable lots, 1094 
Unconstrained optimization, 969 
basic concepts, 951-952 
method of steepest descent, 
952-954 
Uncorrelated related variables, 1109 
Underdamping, 65, 67 
Underdetermined linear systems, 277 
Underflow (floating-point numbers), 
792 
Undetermined coefficients, method of: 
higher-order homogeneous linear 
ODEs, 115 
higher-order linear ODEs, 123 
nonhomogeneous linear systems 
of ODEs, 161 
second-order linear ODEs: 
homogeneous, 104 
nonhomogeneous, 81-85 
Uniform convergence: 
and absolute convergence, 704 
power series, 698-705 
properties of uniform 
convergence, 700-701 
termwise integration, 701-703 
test for, 703-704 
Uniform distributions, 1035-1036, 
1053 
Unifying power of mathematics, 97 
Union, of events, 1016-1017 
Uniqueness: 
of Laplace transforms, 210 
of Laurent series, 712 
of power series representation, 
685-686 
problem of, 39 
Uniqueness theorems: 
cubic splines, 822 
Dirichlet problem, 462, 784 
first-order ODEs, 39-42 
higher-order homogeneous linear 
ODEs, 108 
Laplace’s equation, 462 
linear systems, 138 
proof of, A77—A79 
second-order homogeneous linear 
ODEs, 74 
systems of ODEs, 137 
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nitary matrices, 347-350, 353 

nitary systems, 349 

nitary transformation, 349 

nit binormal vector, 389 

nit circle, 617, 619 

nit impulse function, 226. See also 
Dirac delta function 

nit matrices, 128, 268 

nit normal vectors, 366, 441 

nit principal normal vector, 389 

nit step function (Heaviside 
function), 217-219 

nit tangent vector, 384 

nit vectors, 312, 355 

niversal gravitational constant, 63 

nknowns, 257 

nrepeated factors, 220-221 

nstable algorithms, 796 

nstable critical points, 140, 149 

nstable equilibrium solution, 
33-34 

nstable systems, 84 

pper bound, for flows, 995 

pper confidence limits, 1068 

pper control limit (UCL), 1088 

Jpper triangular matrices, 268 
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Value (sum) of series, 171, 673 


Vandermonde, Alexandre Théophile, 


113n.1 
Vandermonde determinant, 113 
Van der Pol, Balthasar, 158n.4 
Van der Pol equation, 158-160 
Variables: 
artificial, 965-968 
basic, 960 
complex, 620-621 
control, 951 
controlled, 1103 
dependent, 393, 1055, 1056 
independent, 393, 1103 
intermediate, 393 
linearly, 1109 
nonbasic, 960 
random, 1011, 1029-1030, 1061 
continuous, 1029, 1032-1034, 
1055 
defined, 1030 
dependent, 1055 
discrete, 1029-1032, 1054 
function of, 1056 
independence of, 1055-1056 


marginal distribution of, 1054, 


1055 
normal, 1045 
occurrence of, 1063 
probability distributions of, 
1051-1060 
skewness of, 1039 


Variables: (Cont.) 
standardized, 1037 
two-dimensional, 1051, 1062 

slack, 956, 969 

stochastic, 1029 

uncorrelated related, 1109 

Variable coefficients: 

Frobenius method, 180-187 
indicial equation, 181-183 
typical applications, 

183-185 
Laplace transforms ODEs with, 
240-241 
power series method, 167-175 
idea and technique of, 
168-170 
operations on, 173-174 
theory of, 170-174 
second-order homogeneous linear 
ODEs, 73 
Variance(s), 1014, 1061 

comparison of, 1086 

control chart for, 1089-1090 

equality of, 1084n.3 

of normal distributions, 

confidence intervals for, 
1073-1076 
of probability distributions, 
1035-1039 
addition of, 1058-1059 
transformation of, 1036-1037 
sample, 1015 
Variation, random, 1063 
Variation of parameters, method of: 
higher-order linear ODEs, 123 
high-order nonhomogeneous linear 
ODEs, 118-120 
nonhomogeneous linear systems 
of ODEs, 162-163 
second-order linear ODEs: 
homogeneous, 104 
nonhomogeneous, 99-102 
Vectors, 256, 259 
addition and scalar multiplication 
of, 259-261 
calculations with, 126-127 
definitions and terms, 126, 
128-129, 257, 259, 309 
eigenvalues, 129-130 
eigenvectors, 129-130 
linear independence and 
dependence of, 282-283 

multiplying matrices by, 
263-265 

in the plane, 309, 355 

systems of ODEs as vector 
equations, 127-128 

in 3-space, 309 

transposition of, 266-267 

Vector addition, 309, 357-359 
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Vector calculus, 354, 378-380 


differential, see Vector differential 
calculus 

integral, see Vector integral 
calculus 


Vector differential calculus, 354-412 


curves, 381-392 
arc length of, 385-386 
length of, 385 
in mechanics, 386-389 
tangents to, 384-385 
and torsion, 389-390 
gradient of a scalar field, 395-402 
directional derivatives, 
396-397 
maximum increase, 398 
as surface normal vector, 
398-399 
vector fields that are, 
400-401 
inner product (dot product), 
361-367 
applications, 364-366 
orthogonality, 361-363 
scalar functions, 376 
and vector calculus, 378-380 
vector fields, 377-378 
curl of, 406-409 
divergence of, 402-406 
that are gradients of scalar 
fields, 400-401 
vector functions, 375-376 
partial derivatives of, 380 
of several variables, 392—395 
vector product (cross product), 
368-375 
applications, 371-372 
scalar triple product, 373-374 
vectors in 2-space and 3-space: 
components of a vector, 
356-357 
scalar multiplication, 358-359 
vector addition, 357-359 


Vector fields: 


defined, 376 
vector differential calculus, 
377-378 
curl of, 406-409, 412 
divergence of, 402-406 
that are gradients of scalar 
fields, 400-401 


Vector functions: 


continuous, 378-379 

defined, 375-376 

differentiable, 379 

divergence theorem of Gauss, 

453-457 

of several variables, 392—395 
chain rules, 392—394 
mean value theorem, 395 
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Vector functions: (Cont.) 
vector differential calculus, 
375-376, 411 
partial derivatives of, 380 
of several variables, 392-395 
Vectors in 2-space and 3-space: 
components of a vector, 
356-357 
scalar multiplication, 358-359 
vector addition, 357-359 
Vector integral calculus, 413-471 
divergence theorem of Gauss, 
453-463 
double integrals, 426-432 
applications of, 428-429 
change of variables in, 
429-431 
evaluation of, by two 
successive integrations, 
427-428 
Green’s theorem in the plane, 
433-438 
line integrals, 413-419 
definition and evaluation of, 
414-416 
path dependence of, 418-426 
work done by a force, 416-417 
path dependence of line integrals, 
418-426 
defined, 418 
and integration around closed 
curves, 421-425 
Stokes’s Theorem, 463-469 
surface integrals, 443-452 
orientation of surfaces, 
446-447 
without regard to orientation, 
448-450 
surfaces for surface integrals, 
439-443 
representation of surfaces, 
439-441 


Vector integral calculus (Cont.) 
tangent plane and surface 
normal, 441-442 
triple integrals, 452-458 
Vector moment, 371 
Vector norms, 866 
Vector product (cross product): 
in Cartesian coordinates, 
A83-A84 
vector differential calculus, 
368-375, 410 
applications, 371-372 
scalar triple product, 373-374 
Vector spaces, 482 
complex, 309-310, 349 
inner product spaces, 311-313 
linear transformations, 313-317 
real, 309-311 
special, 285-287 
Velocity, 391, 411, 771 
Velocity potential, 771 
Velocity vector, 386, 771 
Venn, John, 1017n.1 
Venn diagrams, 1017 
Verhulst, Pierre-Frangois, 32n.8 
Verhulst equation, 32-33 
Vertices (graphs), 971, 977, 1007 
adjacent, 971, 977 
central, 991 
coloring, 1005-1006 
double labeling of, 986 
eccentricity of, 991 
exposed, 1001, 1003 
four-color theorem, 1006 
scanning, 998 
Vertex condition, 991 
Vertex incidence list (graphs), 973 
Volta, Alessandro, 93n.7 
Voltage drop, 29 
Volterra, Vito, 155n.3, 198n.7, 236n.3 
Volterra integral equations, of the 
second kind, 236-237 


Volume, of a region, 428 
Vortex (fluid flow), 777 
Vorticity, 774 


Walk (shortest path problems), 975 
Wave equation, 544-545, 942 
d’Alembert’s solution, 553-556 
numeric analysis, 942-944, 948 
one-dimensional, 544-545 
solution by separating variables, 
545-553 
two-dimensional, 575-584 
Weber’s equation, 510 
Weber’s functions, 198n.7 
Weierstrass, Karl, 625n.4, 703n.5 
Weierstrass approximation theorem, 
809 
Weierstrass M-test for uniform 
convergence, 703-704 
Weighted graphs, 976 
Weight function, 500 
Well-conditioned problems, 864 
Well-conditioning (linear systems), 
865 
Wessel, Caspar, 611n.2 
Work done by a force, 416-417 
Work integral, 415 
Wronski, Josef Maria Héne, 76n.5 
Wronskian (Wronski determinant): 
second-order homogeneous linear 
ODEs, 75-78 
systems of ODEs, 139 


Zeros, of analytic functions, 717 
Zero matrix, 260 

Zero surfaces, 598 

Zero vector, 129, 260, 357 
z-score, 1014 
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Some Constants 


e = 2.71828 18284 59045 23536 
Ve = 1.64872 12707 00128 14685 
e” = 7.38905 60989 30650 22723 


T = 3.14159 26535 89793 23846 
m” = 9.86960 44010 89358 61883 
Var = 1.77245 38509 05516 02730 


logy9 7 = 0.49714 98726 94133 85435 
In 7 = 1.14472 98858 49400 17414 
logy € = 0.43429 44819 03251 82765 
In 10 = 2.30258 50929 94045 68402 


V2. = 1.41421 35623 73095 04880 
W2 = 1.25992 10498 94873 16477 
V3 = 1.73205 08075 68877 29353 
W3 = 1.44224 95703 07408 38232 
In 2 = 0.69314 71805 59945 30942 
In 3 = 1.09861 22886 68109 69140 


y= 0.57721 56649 01532 86061 
In y = —0.54953 93129 81644 82234 
(see Sec. 5.6) 
1° = 0.01745 32925 19943 29577 rad 
1 rad = 57.29577 95130 82320 87680° 
= 57°17'44.806" 


Polar Coordinates 


x = rcos 6 y =rsin @ 
y 
r=Ve+y¥ tan 9 = — 
x 
dx dy = rdrdé 
Series 
1 co 
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7 m=0 
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=> a 
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7 oo (1a 
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oo x™ 
n(dl-x=-> (lx| < D) 
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arctan x = > 
=0 


< 
mri (asd 


Greek Alphabet 


a Alpha v Nu 

B Beta é Xi 

y,T Gamma O Omicron 

6, A Delta 7 Pi 

eé,€ Epsilon p Rho 
c Zeta o,> Sigma 
nN Eta T Tau 

0, 3,O Theta v, Y Upsilon 

L Tota d, g, ® Phi 
K Kappa Xx Chi 

A, A. Lambda uw, V Psi 


bh Mu w, 2 Omega 


Vectors 


actb = ayb, + dgbo + ashe 


i j k 
axb= ay ag ag 
by by bs 
of of of 
dpa Veo isp 4k 
gad —Wa it its, 


divv = Vev= + + 


i j 
0 te) 
culv=VxXv= 
x oy Oz 
Uy Ug U3 


